Remote operation of a vehicle using virtual representations of a vehicle state

ABSTRACT

In various examples, at least partial control of a vehicle may be transferred to a control system remote from the vehicle. Sensor data may be received from a sensor(s) of the vehicle and the sensor data may be encoded to generate encoded sensor data. The encoded sensor data may be transmitted to the control system for display on a virtual reality headset of the control system. Control data may be received by the vehicle and from the control system that may be representative of a control input(s) from the control system, and actuation by an actuation component(s) of the vehicle may be caused based on the control input.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. Patent Application No. 17/379,691, filed Jul. 19, 2021, which is a continuation of U.S. Patent Application No. 16/366,506, filed Mar. 27, 2019, which claims the benefit of U.S. Provisional Application No. 62/648,493, filed on Mar. 27, 2018. Each of these applications is incorporated herein by reference in its entirety.

BACKGROUND

As autonomous vehicles become more prevalent and rely less on direct human control, the autonomous vehicles may be required to navigate environments or situations that are unknown to them. For example, navigating around pieces of debris in the road, navigating around an accident, crossing into oncoming lanes when a lane of the autonomous vehicle is blocked, navigating through unknown environments or locations, and/or navigating other situations or scenarios may not be possible using the underlying systems of the autonomous vehicles while still maintaining a desired level of safety and/or efficacy.

Some autonomous vehicles, such as those capable of operation at autonomous driving levels 3 or 4 (as defined by the Society of Automotive Engineers (SAE) “Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles”), include controls for a human operator. As such, conventional approaches to handling the above-described situations or scenarios have included handing control back to a passenger of the vehicle (e.g., a driver).

However, for autonomous vehicles of autonomous driving level 5, there may not be a driver, or controls for a driver, so it may not be possible to pass control to a passenger of the autonomous vehicle (or a passenger may be unfit to drive). As another example, the autonomous vehicle may not include passengers (e.g., an empty robo-taxi), or may not be large enough to hold passengers, so control of the autonomous vehicle may be completely self-contained.

Some conventional approaches have provided some level of remote control of autonomous vehicles by using two-dimensional (2D) visualizations projected onto 2D displays, such as computer monitors or television displays. For example, the 2D display(s) at a remote operator’s position may display image data (e.g., a video stream(s)) generated by a camera(s) of the autonomous vehicle to the remote operator, and the remote operator may control the autonomous vehicle using control components of a computer, such as a keyboard, mouse, joystick, and/or the like.

However, using only a 2D visualization on a 2D display(s) may not provide enough immersion or information for the remote operator to control the autonomous vehicle as safely as desired. For example, the remote operator may not gain an intuitive or natural sense of the locations of other objects in the environment relative to the autonomous vehicle by looking at a 2D visualization on a 2D display(s). In addition, providing control of an autonomous vehicle from a remote location using generic computer components (e.g., keyboard, mouse, joystick, etc.) may not lend itself to natural control of the autonomous vehicle (e.g., as a steering wheel, brake, accelerator, and/or other vehicle components would). For example, a correlation (or scale) between inputs to a keyboard (e.g., a left arrow selection) and control of the autonomous vehicle (e.g., turning to the left) may not be known, such that smooth operation may not be achievable (e.g., operation that may make the passengers feel comfortable). Further, by providing only a 2D visualization, valuable information related to the state of the autonomous vehicle may not be presentable to the remote operator in an easily digestible format, such as the angle of the wheels, the current position of the steering wheel, and/or the like.

SUMMARY

Embodiments of the present disclosure relate to remote control of autonomous vehicles. More specifically, systems and methods are disclosed that relate to transferring at least partial control of the autonomous vehicle and/or another object to a remote control system to allow the remote control system to aid the autonomous vehicle and/or other object in navigating an environment.

In contrast to conventional systems, such as those described above, the systems of the present disclosure leverage virtual reality (VR) technology to generate an immersive virtual environment for display to a remote operator. For example, a remote operator (e.g., a human, a robot, etc.) may have at least partial control of the vehicle or other object (e.g., a robot, an unmanned aerial vehicle (UAV), etc.), and may provide controls for the vehicle or other object using a remote control system. Sensor data from the vehicle or other object may be sent from the vehicle or the other object to the remote control system, and the remote control system may generate and render a virtual environment for display using a VR system (e.g., on a display of a VR headset). The remote operator (e.g., a human, a robot, etc.) may provide controls to a control component(s) of the remote control system to control a virtual representation of the vehicle or other object in the virtual environment. The controls from the remote control system may then be sent (e.g., after encoding, scaling, etc.) to the vehicle or other object, and the vehicle or other object may execute controls that are based on the controls from the remote control system.

As a result, a vehicle or other object that may have previously been unable to navigate certain environments, situations, or scenarios (e.g., due to restrictions, rules, etc.) may be controlled, at least partially, through the environments, situations, or scenarios based on controls from the remote operator. Thus, instead of coming to a stop or shutting down, the vehicle or other object may be able to navigate the situation and then continue according to a planned path (e.g., by reentering an autonomous mode). By navigating the situation rather than stopping or shutting down, the vehicle or other object is able to minimize the impact on the scheduled travel of the ego-car and on other vehicles or objects in the environment, and/or can avoid creating an unsafe situation (e.g., by stopping or shutting down on a roadway or in another environment), thereby increasing safety within the environment as well. In addition, because the controls of the remote control system may translate more seamlessly to the vehicle controls (e.g., because the remote control system may include a steering wheel, a brake, and an accelerator), and due to the immersive nature of the virtual environment, the remote operator may be able to navigate the vehicle or other object through the environment more safely and efficiently than is possible with conventional systems.

BRIEF DESCRIPTION OF THE DRAWINGS

The present systems and methods for remote control of autonomous vehicles are described in detail below with reference to the attached drawing figures, wherein:

FIG. 1A is an illustration of a system for remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure;

FIG. 1B is another illustration of a system for remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure;

FIG. 2A is an illustration of an example virtual environment, in accordance with some embodiments of the present disclosure;

FIG. 2B is another illustration of an example virtual environment, in accordance with some embodiments of the present disclosure;

FIG. 3A is an example flow diagram for a method of remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure;

FIG. 3B is an example flow diagram for a method of remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure;

FIG. 4 is an example signal flow diagram for a method of remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure;

FIG. 5A is an example data flow diagram illustrating a process for training an autonomous vehicle using a machine learning model(s), in accordance with some embodiments of the present disclosure;

FIG. 5B is an example illustration of a machine learning model(s) for training an autonomous vehicle according to the process of FIG. 5A, in accordance with some embodiments of the present disclosure;

FIG. 6 is an example flow diagram for a method of training an autonomous vehicle using a machine learning model(s), in accordance with some embodiments of the present disclosure;

FIG. 7A is an illustration of an example autonomous vehicle, in accordance with some embodiments of the present disclosure;

FIG. 7B is an example of camera locations and fields of view for the example autonomous vehicle of FIG. 7A, in accordance with some embodiments of the present disclosure;

FIG. 7C is a block diagram of an example system architecture for the example autonomous vehicle of FIG. 7A, in accordance with some embodiments of the present disclosure;

FIG. 7D is a system diagram for communication between cloud-based server(s) and the example autonomous vehicle of FIG. 7A, in accordance with some embodiments of the present disclosure; and

FIG. 8 is a block diagram of an example computing device suitable for use in implementing some embodiments of the present disclosure.

DETAILED DESCRIPTION

Systems and methods are disclosed related to remote control of autonomous vehicles. The present disclosure may be described with respect to an example autonomous vehicle 102 (alternatively referred to herein as “vehicle 102” or “autonomous vehicle 102”), an example of which is described in more detail herein with respect to FIGS. 7A-7D. However, this is not intended to be limiting. For example, and without departing from the scope of the present disclosure, the systems, methods, and/or processes described herein may be applicable to non-autonomous vehicles, robots, unmanned aerial vehicles, and/or any other type of vehicle or object configured for remote control in addition to, or alternatively from, the autonomous vehicle 102. In addition, although the present disclosure may be described with respect to an autonomous vehicle control system 100, this is not intended to be limiting, and the methods and processes described herein may be implemented on systems including additional or alternative structures, components, and/or architectures without departing from the scope of the present disclosure.

Conventional systems that aim to provide some level of control of an autonomous vehicle and/or other object from a remote location may do so using an entirely two-dimensional (2D) visualization presented on a 2D display, such as a computer monitor or a television display. For example, one or more computer monitors may be used to display a video streamed from a camera of a vehicle, and a remote operator may control the vehicle using a keyboard, mouse, joystick, or other generic computer components. However, using a 2D visualization on a non-immersive 2D display(s) (e.g., computer monitors, television displays, etc.) may not provide enough immersion or information for the remote operator to control the vehicle or other object as safely as desired. For example, the remote operator may not gain a strong sense of the locations of other objects in the environment relative to the vehicle or other object by looking at 2D visualizations displayed on 2D displays. In addition, providing control of a vehicle from a remote location using generic computer components (e.g., keyboard, mouse, joystick, etc.) may not lend itself to natural control of a vehicle (e.g., as a steering wheel, brake, accelerator, and/or other vehicle components would). For example, a correlation (or scale) between inputs to a keyboard (e.g., a left arrow selection) and control of the vehicle (e.g., turning to the left) may not be known, such that smooth operation (e.g., operation that may make the passengers feel comfortable) may not be achievable. Further, by providing only 2D visualizations, valuable information related to the state of the vehicle may not be presentable to the user in an easily digestible format, such as the angle of the wheels, the current position of the steering wheel, and/or the like.

In contrast to conventional systems, the present system may leverage virtual reality (VR) technology to generate an immersive virtual environment for display to a remote operator using a VR system (e.g., displaying the immersive virtual environment on a VR headset, or a display thereof, of the VR system). In some examples, a remote operator may be transferred at least partial control of the vehicle or other object in response to a determination (e.g., by the autonomous vehicle or other object) that the vehicle or object cannot or should not (e.g., based on rules, conditions, constraints, etc.) navigate a situation or environment (e.g., debris blocking a safe path, rules of the road preventing the vehicle from proceeding in a certain way, a dangerous condition, such as a fallen tree or power line, presenting itself, etc.).

Sensor data (e.g., from cameras, LIDAR sensors, RADAR sensors, microphones, etc.) representative of fields of view of the sensors of the vehicle or object may be generated and transmitted to a control system (e.g., the system used by the remote operator). In some examples, at least some of the sensor data (e.g., image data), prior to transmission, may be encoded into a format (e.g., H.264, H.265, AV1, VP9, etc.) that is less data intensive than the format of the sensor data at generation (e.g., raw sensor data). This may have the benefit of minimizing network requirements in order to efficiently transmit the data in real-time. In some examples, the vehicle or object may include multiple modems and/or be capable of communicating across multiple network types (e.g., for redundancy), in order to ensure consistent operation.
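
By way of a non-limiting, illustrative sketch, bandwidth-aware encoding of a sensor data stream might be structured as follows. The codec thresholds, bitrates, and the placeholder encode() helper are assumptions for illustration only and are not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class EncodedFrame:
    codec: str         # e.g., "h264", "h265"
    bitrate_kbps: int  # target bitrate used for encoding
    payload: bytes     # compressed frame data

def encode(raw_frame: bytes, codec: str, bitrate_kbps: int) -> bytes:
    """Placeholder for a real encoder call (e.g., a hardware H.264/H.265 encoder)."""
    return raw_frame  # no-op stand-in so the sketch runs end to end

def select_codec(available_kbps: float) -> tuple:
    """Pick a codec/bitrate pair that fits the measured uplink (illustrative thresholds)."""
    if available_kbps > 8000:
        return "h265", 6000   # plenty of headroom: higher-quality stream
    if available_kbps > 3000:
        return "h264", 2500   # moderate bandwidth
    return "h264", 1000       # constrained link: aggressive compression

def encode_frame(raw_frame: bytes, available_kbps: float) -> EncodedFrame:
    codec, bitrate = select_codec(available_kbps)
    return EncodedFrame(codec, bitrate, encode(raw_frame, codec, bitrate))
```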

In addition to the sensor data, vehicle state data (e.g., wheel angle, steering wheel angle, location, gear (PRND), tire pressure, etc.) representative of a state of the vehicle and/or calibration data (e.g., steering sensitivity, braking sensitivity, acceleration sensitivity, etc.) may be transmitted to the control system. The control system may use the sensor data, the vehicle state data, and/or the calibration data to generate a virtual environment and/or to calibrate the control components of the control system (e.g., a steering component, a braking component, an acceleration component, etc.). With respect to the control components, the control system may calibrate the control components to correspond to the control components of the vehicle. For example, the steering component (e.g., a steering wheel) may be calibrated to the sensitivity of the steering wheel of the vehicle and/or the starting rotation of the steering wheel may be calibrated to correspond to the rotation of the steering wheel of the vehicle. Similarly, the braking and acceleration components may be calibrated.

In some examples, such as where the vehicle is of a different scale than the scale of the virtual vehicle, or has different control mechanisms, the control components of the control system may be calibrated (e.g., scaled) to match those of the vehicle. For example, where the vehicle is one-fifth scale (e.g., one-fifth scale with respect to the remote control components), the control inputs to the control components of the control system may be downscaled to correlate to the scale of the vehicle.
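
For illustration only, one simple way to express such downscaling is shown below; the 0.2 scale factor and the clamp limits are assumptions, not values from the disclosure.

```python
def scale_control(value: float, scale: float = 0.2, limit: float = 1.0) -> float:
    """Scale a normalized control input (e.g., steering in [-1, 1]) to a smaller vehicle."""
    scaled = value * scale
    return max(-limit, min(limit, scaled))  # clamp to the vehicle's valid range

# Example: a full steering input from the remote operator maps to a gentler
# command on a one-fifth-scale vehicle.
steering_for_vehicle = scale_control(1.0)  # -> 0.2
```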

The virtual environment may be generated in a variety of ways. In any example, the virtual environment may include a display of video streams from one or more of the cameras of the vehicle. In some examples, the display may be on display screens (e.g., virtual display screens) within the virtual environment (e.g., NVIDIA’s HOLODECK), while in other examples, the display may be from a vantage point or perspective within a cockpit of a virtual vehicle (e.g., creating an immersive view of the surrounding environment of the vehicle from within the virtual vehicle) to simulate a real-world view when sitting in the vehicle in the physical environment. In any example, the field of view of the remote operator may be from any vantage point or perspective (e.g., outside of the vehicle, beside the vehicle, above the vehicle, within the vehicle, etc.).

Using the cockpit example, the field of view may be more consistent with what a driver would see from a driver’s seat of the vehicle. In such an example, at least some of the structure of the vehicle may be removed from the rendering, or presented as at least partially transparent or translucent (e.g., portions of the vehicle that occlude the field of view out of the vehicle), while other components not normally visible to a driver may be included (e.g., the wheels at their current angle in the environment). Vehicle data may be used to generate a virtual simulation of the vehicle, or of another vehicle, that may include a virtual instrument panel, dashboard, HMI display, controls (e.g., blinkers, etc.), and/or other features and functionality of a vehicle (e.g., if the vehicle is of Make X and Model Y, the virtual environment may include a virtual representation of the vehicle of Make X and Model Y).

The remote operator may use a view of the virtual environment and/or the control components of the control system to control the vehicle in the physical environment. For example, the remote operator may steer, accelerate, and/or brake using the control system, the controls may be transmitted to the vehicle, and the vehicle may use the controls to execute one or more actuations using actuation component(s) of the vehicle. In some examples, the controls input by the remote operator may be used by the vehicle one-to-one, while in other examples, the controls may be used as suggestions (or high-level control commands). For example, a user may provide steering inputs, braking inputs, and/or acceleration inputs, and a control unit in the vehicle may analyze the controls to determine how to most effectively execute them. As another example, the user may provide waypoints for the vehicle (e.g., indicate points in the virtual environment that the virtual vehicle should navigate to, and the virtual points may be translated to real-world points for the vehicle to navigate to). In any example, the vehicle may maintain control at an obstacle avoidance level, such that any controls from the control system may not be executed (or may be altered) if the vehicle determines that a collision may result. In such examples, the vehicle may implement a safety procedure, such as, without limitation, coming to a complete stop, when a collision is determined to be likely or imminent.
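
As a simplified, non-limiting sketch of how these control modes and the obstacle-avoidance override might be arbitrated, the logic could resemble the following; the mode names, dictionary fields, and the collision_likely flag are illustrative assumptions.

```python
from enum import Enum, auto

class ControlMode(Enum):
    DIRECT = auto()      # execute remote controls one-to-one
    SUGGESTION = auto()  # treat remote controls as high-level suggestions
    WAYPOINT = auto()    # remote operator supplies target points instead of controls

def apply_remote_controls(mode: ControlMode, remote_cmd, collision_likely: bool) -> dict:
    """Decide what the vehicle should actually execute for one control update."""
    if collision_likely:
        # Obstacle-avoidance level retains authority: discard the remote command.
        return {"action": "safety_procedure", "detail": "come to a complete stop"}
    if mode is ControlMode.DIRECT:
        return {"action": "actuate", "command": remote_cmd}
    if mode is ControlMode.SUGGESTION:
        # A planner/controller would refine the suggestion before actuation.
        return {"action": "plan_from_suggestion", "command": remote_cmd}
    return {"action": "plan_to_waypoint", "waypoint": remote_cmd}
```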

In some examples, the transmission of data between the vehicle and the control system may be via one or more networks, such as cellular networks (e.g., 5G), Wi-Fi, and/or other network types. When possible, the vehicle may transmit all of the sensor data used by the control system to generate the virtual environment and/or calibrate the control components. However, in situations where the network connection strength is low (e.g., below a threshold), or to otherwise limit bandwidth usage, the vehicle may send only the minimum data required to enable safe and effective operation of the vehicle. For example, the sensors that are to continue to transmit data to the control system may be determined based on an orientation of a virtual reality headset worn by the remote operator. In such examples, the sensor data from the sensors having fields of view that correspond to the virtual field of view of the remote operator, based on the current orientation, may be sent (or data corresponding to portions of the fields of view thereof). For example, if a remote operator is looking forward, at least some of the sensor data from the sensors with fields of view to the rear of the vehicle may not be transmitted. Similarly, if the remote operator looks toward the virtual rear-view mirror in the virtual environment (e.g., based on eye gaze or eye tracking information), at least some of the sensor data from the sensors to the rear of the vehicle (e.g., that are used to render a view on the virtual rear-view mirror) may be transmitted, while at least some sensor data from sensors with fields of view to the side and/or front of the vehicle may not.
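
A minimal sketch of such orientation-based sensor selection is shown below; the camera layout, field-of-view numbers, and the operator field of view are illustrative assumptions rather than parameters from the disclosure.

```python
# Illustrative camera layout: mounting yaw (degrees) and horizontal field of view.
CAMERAS = {
    "front": {"yaw": 0.0,   "fov": 120.0},
    "left":  {"yaw": 90.0,  "fov": 120.0},
    "rear":  {"yaw": 180.0, "fov": 120.0},
    "right": {"yaw": 270.0, "fov": 120.0},
}

def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def cameras_to_stream(headset_yaw: float, operator_fov: float = 110.0) -> list:
    """Return cameras whose fields of view overlap the operator's current view."""
    selected = []
    for name, cam in CAMERAS.items():
        overlap_limit = (cam["fov"] + operator_fov) / 2.0
        if angular_difference(headset_yaw, cam["yaw"]) <= overlap_limit:
            selected.append(name)
    return selected

# Example: an operator looking forward streams the front and side cameras, while
# rear-facing sensor data can be withheld to save bandwidth.
print(cameras_to_stream(headset_yaw=0.0))
```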

In some examples, the controls implemented by the remote operator and/or the corresponding sensor data may be applied to machine learning model(s) to train the machine learning models on how to navigate unknown or uncertain situations or scenarios represented by the sensor data. For example, if a vehicle was operating in a previously unexplored location, and experienced a situation that had not yet been experienced (e.g., due to the unique environment), the remote operator may take over control of the vehicle and control the vehicle through the situation. The sensor data and/or the controls from navigating through the situation may then be used to train a neural network (e.g., for use in automatically controlling the vehicle in similar future situations).
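
As a minimal, non-limiting sketch of such training (behavior cloning), the loop below uses PyTorch and assumes the sensor data has already been reduced to fixed-length feature vectors; the network sizes, feature dimension, and control dimension are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn

# Toy behavior-cloning setup: map a sensor feature vector to [steering, throttle, brake].
model = nn.Sequential(
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 3),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(sensor_features: torch.Tensor, operator_controls: torch.Tensor) -> float:
    """One gradient step toward imitating the remote operator's recorded controls."""
    optimizer.zero_grad()
    predicted = model(sensor_features)
    loss = loss_fn(predicted, operator_controls)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example with random stand-in data (batch of 32 frames).
features = torch.randn(32, 256)
controls = torch.randn(32, 3)
train_step(features, controls)
```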

With reference to FIGS. 1A-1B, FIGS. 1A-1B are block diagrams of an example autonomous vehicle control system 100, in accordance with some embodiments of the present disclosure. It should be understood that this and other arrangements described herein are set forth only as examples. Other arrangements and elements (e.g., machines, interfaces, functions, orders, groupings of functions, etc.) may be used in addition to or instead of those shown, and some elements may be omitted altogether. Further, many of the elements described herein are functional entities that may be implemented as discrete or distributed components or in conjunction with other components, and in any suitable combination and location. Various functions described herein as being performed by entities may be carried out by hardware, firmware, and/or software. For instance, various functions may be carried out by a processor executing instructions stored in memory.

The illustration of FIG. 1A may represent a more generalized illustration of the autonomous vehicle control system 100 as compared to the illustration of FIG. 1B. The components, features, and/or functionality of the autonomous vehicle 102 described with respect to FIGS. 1A-1B may be implemented using the features, components, and/or functionality described in more detail herein with respect to FIGS. 7A-7D. In addition, as described herein, the components, features, and/or functionality of the autonomous vehicle 102 and/or remote control system 106 described with respect to FIGS. 1A-1B may be implemented using the features, components, and/or functionality described in more detail herein with respect to example computing device 800 of FIG. 8.

The autonomous vehicle control system 100 may include an autonomous vehicle 102, one or more networks 104, and a remote control system 106. The autonomous vehicle 102 may include a drive stack 108, sensors 110, and/or vehicle controls 112. The drive stack 108 may represent an autonomous driving software stack, as described in more detail herein with respect to FIG. 1B. The sensor(s) 110 may include any number of sensors of the vehicle 102, including, with reference to FIGS. 7A-7D, global navigation satellite system (GNSS) sensor(s) 758, RADAR sensor(s) 760, ultrasonic sensor(s) 762, LIDAR sensor(s) 764, inertial measurement unit (IMU) sensor(s) 766, microphone(s) 796, stereo camera(s) 768, wide-view camera(s) 770, infrared camera(s) 772, surround camera(s) 774, long range and/or mid-range camera(s) 798, and/or other sensor types. The sensor(s) 110 may generate sensor data (e.g., image data) representing a field(s) of view of the sensor(s) 110.

For example, the sensor data may represent a field of view of each of a number of cameras of the vehicle 102. In some examples, the sensor data may be generated from any number of cameras that may provide a representation of substantially 360 degrees around the vehicle 102 (e.g., fields of view that extend substantially parallel to a ground plane). In such an example, the fields of view may include a left side of the vehicle 102, a rear of the vehicle 102, a front of the vehicle 102, and/or a right side of the vehicle 102. The sensor data may further be generated to include fields of view above and/or below the vehicle 102 (e.g., of the ground or driving surface around the vehicle 102 and/or of the space above the vehicle 102). In some examples, the sensor data may be generated to include blind spots of the vehicle 102 (e.g., using wing-mirror mounted camera(s)). As another example, the sensor data may be generated from some or all of the camera(s) illustrated in FIG. 7B. As such, the sensor data generated by the vehicle 102 may include sensor data from any number of sensors without departing from the scope of the present disclosure.

With reference to FIG. 1A, an image 146 may include a representation of sensor data (e.g., image data) generated from a front-facing camera of the vehicle 102. The image 146 may include a two-way street 148 divided by a solid line 150, such that the vehicle 102, when following the rules of the road, may not be allowed to cross the solid line 150 to pass a vehicle or object in the lane of the vehicle 102. In the image 146, a van 152 may be stopped in the lane of the vehicle 102 to unload boxes 154, so the vehicle 102 may have come to a stop a safe distance behind the van 152. By following the constraints of the vehicle 102 (e.g., due to the rules of the road), the vehicle 102 may, without the features and functionality of the present disclosure, remain stopped behind the van 152 until the van 152 moves (or may pass control to a human operator, depending on the embodiment). However, in the current autonomous vehicle control system 100, the vehicle 102 may determine, in response to encountering the situation represented in the image 146, to transfer at least partial control to the remote control system 106. In other examples, the determination to transfer the control of the vehicle 102 (e.g., to initiate a remote control session) may be made by the remote operator (or otherwise may be made at the remote control system 106); by a passenger of the vehicle 102 (e.g., using a command or signal, such as a voice command, an input to a user interface element, a selection of a physical button, etc.); and/or by another actor. For example, sensor data may be analyzed at the remote control system 106 (and/or by another system remote from the vehicle 102) and may be used to determine whether a remote control session should be initiated.

Although the situation represented in FIG. 1A includes a van 152 blocking the lane of the vehicle 102, this is not intended to be limiting. For example, any number of situations, scenarios, and/or environments, including but not limited to those described herein, may lead to a determination by the vehicle 102 to transfer at least partial control to the remote control system 106 without departing from the scope of the present disclosure. In other examples, the determination may be made by the remote control system 106 to take over control of the vehicle 102. In any example, proper consent may be obtained from the owner and/or operator of the vehicle 102 in order to enable takeover by the remote operator of the remote control system 106.

In addition to the image 146, the vehicle 102 may also capture additional sensor data from additional sensors 110 of the vehicle 102, such as from a side-view camera(s), a rear-view camera(s), a surround camera(s), a wing-mirror mounted camera(s), a roof-mounted camera(s), parking camera(s) (e.g., with a field(s) of view of the ground surface around the vehicle 102), LIDAR sensor(s), RADAR sensor(s), microphone(s), etc. The sensor data generated by the sensor(s) 110 may be transmitted over the network(s) 104 to the remote control system 106. In some examples, the sensor(s) 110 may generate the sensor data in a first format (e.g., a raw format) that may be of a first data size. In order to minimize bandwidth requirements, the sensor data may be encoded in a second format that may be of a second data size less than the first data size (e.g., to decrease the amount of data being sent over the network(s) 104).

In addition to the sensor data that may be used to generate a representation of the environment of the vehicle 102, vehicle state data (e.g., representative of the state of the vehicle 102) and/or calibration data (e.g., for calibrating the remote control(s) 118 according to the vehicle control(s) 112) may also be transmitted over the network(s) 104 to the remote control system 106. For example, the vehicle state data and/or the calibration data may be determined using one or more sensors 110 of the vehicle 102, such as the steering sensor(s) 740, speed sensor(s) 744, brake sensor(s), IMU sensor(s) 766, GNSS sensor(s) 758, and/or other sensors 110. The vehicle state data may include wheel angles, steering wheel angle, location, gear (e.g., Park, Reverse, Neutral, Drive (PRND)), tire pressure, speed, velocity, orientation, etc. The calibration data may include steering sensitivity, braking sensitivity, acceleration sensitivity, etc. In some examples, the calibration data may be determined based on a make, model, or type of the vehicle 102. This information may be encoded in the calibration data by the vehicle 102 and/or may be determined by the remote control system 106, such as by accessing one or more data stores (e.g., after determining identification information for the vehicle 102).
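
Purely as an illustrative sketch, such vehicle state and calibration messages could be structured as follows; the field names, units, and types are assumptions chosen for readability, not a format specified by the disclosure.

```python
from dataclasses import dataclass
from enum import Enum

class Gear(Enum):
    PARK = "P"
    REVERSE = "R"
    NEUTRAL = "N"
    DRIVE = "D"

@dataclass
class VehicleState:
    wheel_angle_deg: float           # road-wheel angle
    steering_wheel_angle_deg: float  # hand-wheel angle
    gear: Gear
    speed_mps: float
    tire_pressure_kpa: list          # one entry per tire
    latitude: float
    longitude: float

@dataclass
class CalibrationData:
    steering_sensitivity: float      # scaling between remote and vehicle steering
    braking_sensitivity: float
    acceleration_sensitivity: float
    make: str = ""
    model: str = ""
```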

The sensor data, the vehicle state data, and/or the calibration data may be received by the remote control system 106 over the network(s) 104. The network(s) 104 may include one or more network types, such as cellular networks (e.g., 5G, 4G, LTE, etc.), Wi-Fi networks (e.g., where accessible), low-power wide-area networks (LPWANs) (e.g., LoRaWAN, SigFox, etc.), and/or other network types. In some examples, the vehicle 102 may include one or more modems and/or one or more antennas for redundancy and/or for communicating over different network types depending on network availability.

The remote control system 106 may include a virtual environment generator 114, a VR headset 116, and a remote control(s) 118. The virtual environment generator 114 may use the sensor data, the vehicle state data, and/or the calibration data to generate a virtual environment that may represent the environment (e.g., the real-world or physical environment, such as the ground surface, the vehicles, the people or animals, the buildings, the objects, etc.) in the field(s) of view of the sensor(s) 110 of the vehicle 102 (e.g., the camera(s), the LIDAR sensor(s), the RADAR sensor(s), etc.), as well as represent at least a portion of the vehicle 102 (e.g., an interior, an exterior, components, features, displays, instrument panels, etc.) and/or controls of the vehicle 102 (e.g., a virtual steering wheel, a virtual brake pedal, a virtual gas pedal, a virtual blinker, a virtual HMI display, etc.). In some examples, the virtual environment may include virtual representations of portions of the vehicle 102 that may not be visible to a driver or passenger of the vehicle 102 in the real-world environment, such as the wheels at an angle (e.g., corresponding to the angle of the wheels of the vehicle 102 in the real-world environment as determined by the vehicle state data and/or the calibration data), which may be viewable from within a virtual cockpit of the virtual vehicle by making one or more other components of the virtual vehicle fully transparent, semi-transparent (e.g., translucent), or removed from the rendering altogether.

The virtual environment may be generated from any number of vantage points of a remote operator. As non-limiting examples, the virtual environment may be generated from a vantage point within a driver’s seat of the virtual vehicle (e.g., as illustrated in FIG. 2B); from another location within the virtual vehicle; or from a position outside of the virtual vehicle (e.g., as illustrated in FIG. 2A), such as on top of the virtual vehicle, to the side of the virtual vehicle, behind the virtual vehicle, above the virtual vehicle, etc. In some examples, the remote operator may be able to select from any number of different vantage points and/or may be able to transition between different vantage points, even in the same remote control session. For example, the remote operator may start a remote control session from a first vantage point inside the cockpit of the virtual vehicle (e.g., in the driver’s seat), and then, when navigating through a tight space or around an obstacle, may transition to a second vantage point outside of the virtual vehicle where the relationship between the tight space or the obstacle and the virtual vehicle may be more clearly visualized. In any example, the desired vantage point of the remote operator may be selectable within the remote control system. The remote operator may be able to set defaults or preferences with respect to vantage points.

The remote operator may be able to set defaults and/or preferences with respect to other information in the virtual environment, such as the representations of information that the remote operator would like to have available within the virtual environment, or more specifically with respect to the virtual vehicle in the virtual environment (e.g., the remote operator may select which features of the instrument panel should be populated, what should be displayed on a virtual HMI display, which portions of the vehicle should be transparent and/or removed, what color the virtual vehicle should be, what color the interior should be, etc.). As such, the remote operator may be able to generate a custom version of the virtual vehicle within the virtual environment. In any example, even where the virtual vehicle is not the same year, make, model, and/or type as the vehicle 102 in the real-world environment, the virtual vehicle may be scaled to occupy a substantially similar amount of space in the virtual environment as the vehicle 102 occupies in the real-world environment. As such, even when the virtual vehicle is of a different size or shape than the vehicle 102, the representation of the virtual vehicle may provide a more direct visualization to the remote operator of the amount of space the vehicle 102 occupies in the real-world environment.

In other examples, the virtual vehicle may be generated according to the year, make, model, type, and/or other information of the vehicle 102 in the real-world environment (e.g., if the vehicle 102 is a Year N (e.g., 2019), Make X, and Model Y, the virtual vehicle may represent a vehicle with the dimensions and steering/driving profiles consistent with a Year N, Make X, Model Y vehicle). In such examples, the remote operator may still be able to customize the virtual vehicle, such as by removing or making transparent certain features, changing a color, changing an interior design, etc., but, in some examples, may not be able to customize the general shape or size of the vehicle.

The virtual environment (e.g., virtual environment 156) may be rendered and displayed on a display of the VR headset 116 of the remote operator (e.g., remote operator 158). The virtual environment 156 may represent a virtual vehicle, which may correspond to the vehicle 102, from a vantage point of the driver’s seat. The virtual environment 156 may include a representation of what a passenger of the vehicle 102 may see when sitting in the driver’s seat. The camera(s) or other sensor(s) 110 may not capture the sensor data from the same perspective as a passenger or driver of the vehicle. As a result, in order to generate the virtual environment 156 (or other virtual environments where the vantage point does not directly correspond to a field(s) of view of the sensor(s)), the sensor data may be manipulated. For example, the sensor data may be distorted or warped prior to displaying the rendering on the display of the VR headset 116. In some examples, distorting or warping the sensor data may include performing a fisheye reduction technique on one or more of the sensor data feeds (e.g., video feeds from one or more camera(s)). In other examples, distorting or warping the sensor data may include executing a positional warp technique to adjust a vantage point of a sensor data feed to a desired vantage point. In such an example, such as where a camera(s) is roof-mounted on the vehicle 102, a positional warp technique may be used to adjust, or bring down, the image data feed from the roof level of the camera(s) to the eye level of a virtual driver of the virtual vehicle (e.g., the remote operator).
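
As a minimal sketch of one possible fisheye reduction step, the snippet below uses OpenCV's fisheye model and assumes the camera's intrinsic matrix K and distortion coefficients D are available from a prior calibration; the placeholder values shown are not real calibration data, and this is only one of many ways such a reduction could be performed.

```python
import cv2
import numpy as np

def reduce_fisheye(frame: np.ndarray, K: np.ndarray, D: np.ndarray) -> np.ndarray:
    """Undistort a fisheye camera frame so it can be shown on a flat virtual surface.

    K is the 3x3 camera intrinsic matrix and D holds the four fisheye distortion
    coefficients, both obtained from a prior camera calibration.
    """
    return cv2.fisheye.undistortImage(frame, K, D, Knew=K)

# Example usage with placeholder calibration values (not real calibration data).
K = np.array([[400.0, 0.0, 640.0],
              [0.0, 400.0, 360.0],
              [0.0, 0.0, 1.0]])
D = np.zeros((4, 1))
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
undistorted = reduce_fisheye(frame, K, D)
```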

In some examples, the sensor data may be manipulated in order to blend or stitch together sensor data corresponding to different fields of view of different sensors. For example, two or more sensors may be used to generate the representation of the environment (e.g., a first camera with a first field of view to the front of the vehicle 102, a second camera with a second field of view to a left side of the vehicle 102, and so on). In such examples, image or video stitching techniques may be used to stitch together or combine sensor data, such as images or video, to generate a field of view (e.g., 360 degrees) for the remote operator with virtually seamless transitions between the fields of view represented by the different sensor data from different sensors 110. In one or more example embodiments, the sensor data may be manipulated and presented to the remote operator in a 3D visualization (e.g., stereoscopically). For example, one or more stereo cameras 768 of the vehicle 102 may generate images, and the images may be used (e.g., using one or more neural networks, using photometric consistency, etc.) to determine depth (e.g., along a Z-axis) for portions of the real-world environment that correspond to the images. As such, the 3D visualization may be generated using the stereoscopic depth information from the stereo cameras 768. In other examples, the depth information may be generated using LIDAR sensors, RADAR sensors, and/or other sensors of the vehicle 102. In any example, the depth information may be leveraged to generate the 3D visualization for display or presentation to the remote operator within the virtual environment. In such examples, some or all of the rendering or display of the virtual environment to the remote operator may include a 3D visualization.

In some examples, because the vehicle 102 may be an autonomous vehicle capable of operating at autonomous driving level 5 (e.g., fully autonomous driving), the vehicle 102 may not include a steering wheel. However, even in such examples, the virtual vehicle may include the steering wheel 160 (e.g., in a position relative to a driver’s seat, if the vehicle 102 had a driver’s seat) in order to provide the remote operator 158 a natural point of reference for controlling the virtual vehicle. In addition to the steering wheel 160, the interior of the virtual vehicle may include a rear-view mirror 164 (which may be rendered to display image data representative of a field(s) of view of a rear-facing camera(s)), wing mirrors (which may be rendered to display image data representative of field(s) of view of side-view camera(s), wing-mounted camera(s), etc.), a virtual HMI display 162, door handles, doors, a roof, a sunroof, seats, consoles, and/or other portions of the virtual vehicle (e.g., based on default settings, based on preferences of the remote operator, and/or preferences of another user(s) of the remote control system 106, etc.).

As described herein, at least some of the portions of the virtual vehicle may be made at least partially transparent and/or may be removed from the virtual environment. An example is the support column 166 of the vehicle chassis being at least partially transparent and/or removed from the virtual vehicle, such that objects and the surface in the virtual environment are not occluded, or are at least less occluded, by the support column 166. Examples of the virtual environment 156 are described in more detail herein with respect to FIG. 2B.

The instance of the virtual environment 156 in FIG. 1A (and correspondingly, FIG. 2B) may represent a time at which the image 146 was captured by the vehicle 102, and thus may include, as viewed through a windshield of the virtual vehicle, virtual representations of the van 152, the boxes 154, the street 148, and the solid line 150. In some examples, such as in the virtual environment 156, each of the virtual objects in the virtual environment may be rendered relative to the virtual vehicle to correspond to the relative location of the objects in the real-world environment with respect to the vehicle 102 (e.g., using depth information from the sensor data). The virtual representations of the image data may include the images or video from the image data, rendered within the virtual environment. As described herein, the virtual environment may be rendered from any of a number of different vantage points (including those illustrated in FIGS. 2A-2B), and the virtual environment 156 is only one, non-limiting example of a virtual environment.

The remote operator 158 may use the remote control(s) 118 to control the virtual vehicle in the virtual environment. The remote control(s) 118 may include a steering wheel 168 (or other control(s) for providing steering inputs, such as keyboards, joysticks, handheld controllers, etc.), an acceleration component 170 (which may be a physical pedal as illustrated in FIG. 1A, or may be a keyboard, a joystick, a handheld controller, a button, etc.), a braking component 172 (which may be a physical pedal as illustrated in FIG. 1A, or may be a keyboard, a joystick, a handheld controller, a button, etc.), and/or other control components, such as blinker actuators (which may be physical levers, or may be controlled using a keyboard, a joystick, a handheld controller, voice, etc.), a horn, light actuators (such as a button, lever, or knob for turning lights on and off, including driving lights, fog lights, high-beams, etc.), etc.

In some examples, the remote control(s) 118 may include pointers (e.g., controllers or other objects) that may be used to indicate or identify a location in the environment that the virtual vehicle should navigate to. In such examples, the remote control(s) 118 may be used to provide input to the vehicle 102 as to where in the real-world environment the vehicle 102 should navigate, and the vehicle 102 may use this information to generate controls for navigating to the location. For example, with respect to the image 146, the remote operator 158 may point to a location in the lane to the left of the vehicle 102 and the van 152, such that the vehicle 102 is able to use the information to override the rules of the road that have stopped the vehicle from passing the van 152, and to proceed to the adjacent lane in order to pass the van 152 and the boxes 154. More detail regarding control input types is provided herein with respect to FIG. 1B.

In any example, the remote operator 158 may control the virtual vehicle through the virtual environment 156, and the control inputs to the remote control(s) 118 may be captured. Control data representative of each of the control inputs (e.g., as they are received by the remote control system 106) may be transmitted to the vehicle 102 over the network(s) 104. In some examples, as described in more detail herein, the control data may be encoded by the remote control system 106 prior to transmission and/or may be encoded upon receipt by the vehicle 102. The encoding may be to convert the control data from the remote control system 106 to vehicle control data suitable for use by the vehicle 102. The control data may be scaled, undergo a format change, and/or undergo other encoding to convert the control data to vehicle control data that the vehicle 102 understands and can execute. As a result, as the remote operator 158 controls the virtual vehicle through the virtual environment, the vehicle 102 may be controlled through the real-world environment accordingly. With respect to the image 146 and the virtual environment 156, the remote operator 158 may control the virtual vehicle to navigate around the virtual representation of the van 152 by entering the adjacent lane of the street 148 to the left of the van 152, passing the van 152, and then reentering the original lane. Responsive to the input controls from the remote operator 158, the vehicle 102 may, at substantially the same time, navigate around the van 152 by entering the adjacent lane of the street 148 in the real-world environment, proceeding past the van 152, and then reentering the original lane of the street 148.

In some examples, such as depending on the preferences of the owner and/or operator of the vehicle 102, a remote control session may be substantially seamless to any passengers of the vehicle 102, such that the passengers may not be made aware of, or notice, the transfer of control to the remote control system 106 and then back to the vehicle 102. In other examples, further depending on the preferences of the owner and/or operator, the passengers of the vehicle may be informed prior to and/or during the time when the control is passed to the remote control system 106. For example, the remote control system 106 may include a microphone(s) and/or a speaker(s) (e.g., headphones, standalone speakers, etc.), and the vehicle 102 may include a microphone(s) and/or a speaker(s), such that one-way or two-way communication may take place between the passengers and the remote operator 158. In such examples, once control is passed back to the vehicle 102, the passengers may again be made aware of the transition.

Now referring to FIG. 1B, FIG. 1B may include a more detailed illustration of the autonomous vehicle control system 100 of FIG. 1A. The autonomous vehicle 102 may include the drive stack 108, which may include a sensor manager 120, perception component(s) 122 (e.g., corresponding to a perception layer of the drive stack 108), a world model manager 124, planning component(s) 126 (e.g., corresponding to a planning layer of the drive stack 108), control component(s) 128 (e.g., corresponding to a control layer of the drive stack 108), obstacle avoidance component(s) 130 (e.g., corresponding to an obstacle or collision avoidance layer of the drive stack 108), actuation component(s) 132 (e.g., corresponding to an actuation layer of the drive stack 108), and/or other components corresponding to additional and/or alternative layers of the drive stack 108.

The sensor manager 120 may manage and/or abstract sensor data from the sensors 110 of the vehicle 102. For example, and with reference to FIG. 7C, the sensor data may be generated (e.g., perpetually, at intervals, based on certain conditions) by global navigation satellite system (GNSS) sensor(s) 758, RADAR sensor(s) 760, ultrasonic sensor(s) 762, LIDAR sensor(s) 764, inertial measurement unit (IMU) sensor(s) 766, microphone(s) 796, stereo camera(s) 768, wide-view camera(s) 770, infrared camera(s) 772, surround camera(s) 774, long range and/or mid-range camera(s) 798, and/or other sensor types.

The sensor manager 120 may receive the sensor data from the sensors in different formats (e.g., sensors of the same type, such as LIDAR sensors, may output sensor data in different formats), and may be configured to convert the different formats to a uniform format (e.g., for each sensor of the same type). As a result, other components, features, and/or functionality of the autonomous vehicle 102 may use the uniform format, thereby simplifying processing of the sensor data. In some examples, the sensor manager 120 may use a uniform format to apply control back to the sensors of the vehicle 102, such as to set frame rates or to perform video gain control. The sensor manager 120 may also update sensor packets or communications corresponding to the sensor data with timestamps to help inform processing of the sensor data by various components, features, and functionality of the autonomous vehicle control system 100.
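
As a non-limiting, illustrative sketch of a uniform, timestamped sensor packet, one possible structure is shown below; the field names and the no-op conversion step are assumptions made only to keep the example self-contained.

```python
import time
from dataclasses import dataclass, field

@dataclass
class SensorPacket:
    sensor_id: str
    sensor_type: str  # e.g., "lidar", "camera", "radar"
    data: bytes       # payload converted to the uniform format
    timestamp: float = field(default_factory=time.time)

def to_uniform_format(sensor_id: str, sensor_type: str, raw: bytes) -> SensorPacket:
    """Wrap raw sensor output in a uniform, timestamped packet.

    A real sensor manager would also convert vendor-specific encodings here
    (e.g., different LIDAR point formats) into one representation per sensor type.
    """
    converted = raw  # conversion is a no-op in this sketch
    return SensorPacket(sensor_id=sensor_id, sensor_type=sensor_type, data=converted)
```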

The world model manager 124 may be used to generate, update, and/or define a world model. The world model manager 124 may use information generated by and received from the perception component(s) 122 of the drive stack 108. The perception component(s) 122 may include an obstacle perceiver, a path perceiver, a wait perceiver, a map perceiver, and/or other perception component(s) 122. For example, the world model may be defined, at least in part, based on affordances for obstacles, paths, and wait conditions that can be perceived in real-time or near real-time by the obstacle perceiver, the path perceiver, the wait perceiver, and/or the map perceiver. The world model manager 124 may continually update the world model based on newly generated and/or received inputs (e.g., data) from the obstacle perceiver, the path perceiver, the wait perceiver, the map perceiver, and/or other components of the autonomous vehicle control system 100.

The world model may be used to help inform the planning component(s) 126, control component(s) 128, obstacle avoidance component(s) 130, and/or actuation component(s) 132 of the drive stack 108. The obstacle perceiver may perform obstacle perception that may be based on where the vehicle 102 is allowed to drive or is capable of driving, and how fast the vehicle 102 can drive without colliding with an obstacle (e.g., an object, such as a structure, entity, vehicle, etc.) that is sensed by the sensors 110 of the vehicle 102.

The path perceiver may perform path perception, such as by perceiving nominal paths that are available in a particular situation. In some examples, the path perceiver may further take into account lane changes for path perception. A lane graph may represent the path or paths available to the vehicle 102, and may be as simple as a single path on a highway on-ramp. In some examples, the lane graph may include paths to a desired lane and/or may indicate available changes down the highway (or other road type), or may include nearby lanes, lane changes, forks, turns, cloverleaf interchanges, merges, and/or other information.

The wait perceiver may be responsible for determining constraints on the vehicle 102 as a result of rules, conventions, and/or practical considerations. For example, the rules, conventions, and/or practical considerations may be in relation to traffic lights, multi-way stops, yields, merges, toll booths, gates, police or other emergency personnel, road workers, stopped buses or other vehicles, one-way bridge arbitrations, ferry entrances, etc. In some examples, the wait perceiver may be responsible for determining longitudinal constraints on the vehicle 102 that require the vehicle to wait or slow down until some condition is true. In some examples, wait conditions arise from potential obstacles, such as crossing traffic in an intersection, that may not be perceivable by direct sensing by the obstacle perceiver (e.g., by using sensor data from the sensors 110, because the obstacles may be occluded from the fields of view of the sensors 110). As a result, the wait perceiver may provide situational awareness by resolving the danger of obstacles that are not always immediately perceivable through rules and conventions that can be perceived and/or learned. Thus, the wait perceiver may be leveraged to identify potential obstacles and implement one or more controls (e.g., slowing down, coming to a complete stop, etc.) that may not have been possible relying solely on the obstacle perceiver.

The map perceiver may include a mechanism by which behaviors are discerned and, in some examples, by which specific examples of the conventions that are applied at a particular locale are determined. For example, the map perceiver may determine, from data representing prior drives or trips, that at a certain intersection there are no U-turns between certain hours, that an electronic sign showing directionality of lanes changes depending on the time of day, that two traffic lights in close proximity (e.g., barely offset from one another) are associated with different roads, that in Rhode Island the first car waiting to make a left turn at a traffic light breaks the law by turning before oncoming traffic when the light turns green, and/or other information. The map perceiver may inform the vehicle 102 of static or stationary infrastructure objects and obstacles. The map perceiver may also generate information for the wait perceiver and/or the path perceiver, for example, to determine which light at an intersection has to be green for the vehicle 102 to take a particular path.

In some examples, information from the map perceiver may be sent, transmitted, and/or provided to a server(s) (e.g., to a map manager of server(s) 778 of FIG. 7D), and information from the server(s) may be sent, transmitted, and/or provided to the map perceiver and/or a localization manager of the vehicle 102. The map manager may include a cloud mapping application that is remotely located from the vehicle 102 and accessible by the vehicle 102 over the network(s) 104. For example, the map perceiver and/or the localization manager of the vehicle 102 may communicate with the map manager and/or one or more other components or features of the server(s) to inform the map perceiver and/or the localization manager of past and present drives or trips of the vehicle 102, as well as past and present drives or trips of other vehicles. The map manager may provide mapping outputs (e.g., map data) that may be localized by the localization manager based on a particular location of the vehicle 102, and the localized mapping outputs may be used by the world model manager 124 to generate and/or update the world model.

In any example, when a determination is made, based on information from the path perceiver, the wait perceiver, the map perceiver, the obstacle perceiver, and/or another component of the perception component(s) 122, that the vehicle 102 is prevented from proceeding through a certain situation, scenario, and/or environment, at least partial control may be transferred to the remote control system 106. In some examples, the passengers of the vehicle 102 may be given an option to wait until the vehicle 102 is able to proceed based on internal rules, conventions, standards, constraints, etc., or to transfer the control to the remote control system 106 to enable the remote operator to navigate the vehicle 102 through the situation, scenario, and/or environment. The remote operator, once given control, may provide control inputs to the remote control(s) 118, and the vehicle 102 may execute vehicle controls corresponding to the control inputs that are understandable to the vehicle 102.

The planning component(s) 126 may include a route planner, a lane planner, a behavior planner, and a behavior selector, among other components, features, and/or functionality. The route planner may use the information from the map perceiver, the map manager, and/or the localization manager, among other information, to generate a planned path that may consist of GNSS waypoints (e.g., GPS waypoints). The waypoints may be representative of a specific distance into the future for the vehicle 102, such as a number of city blocks, a number of kilometers/miles, a number of meters/feet, etc., that may be used as a target for the lane planner.

The lane planner may use the lane graph (e.g., the lane graph from the path perceiver), object poses within the lane graph (e.g., according to the localization manager), and/or a target point and direction at the distance into the future from the route planner as inputs. The target point and direction may be mapped to the best-matching drivable point and direction in the lane graph (e.g., based on GNSS and/or compass direction). A graph search algorithm may then be executed on the lane graph, from a current edge in the lane graph, to find the shortest path to the target point.
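
For illustration only, a minimal version of such a graph search is sketched below using Dijkstra's algorithm over a lane graph represented as an adjacency mapping; the graph structure and edge costs are invented for the example and are not part of the disclosure.

```python
import heapq

def shortest_path(lane_graph: dict, start: str, target: str) -> list:
    """Dijkstra over a lane graph given as {edge_id: [(neighbor_id, cost), ...]}."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in lane_graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
    return []  # no drivable route to the target point

# Example: find a route from the current lane edge to the edge containing the target.
graph = {
    "current": [("lane_a", 1.0), ("lane_b", 2.5)],
    "lane_a": [("target", 1.5)],
    "lane_b": [("target", 0.5)],
}
print(shortest_path(graph, "current", "target"))  # ['current', 'lane_a', 'target']
```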

The behavior planner may determine the feasibility of basic behaviors of the vehicle 102, such as staying in the lane or changing lanes left or right, so that the feasible behaviors may be matched up with the most desired behaviors output from the lane planner. For example, if the desired behavior is determined to not be safe and/or available, a default behavior may be selected instead (e.g., the default behavior may be to stay in the lane when the desired behavior of changing lanes is not safe).

The control component(s) 128 may follow a trajectory or path (lateral and longitudinal) that has been received from the behavior selector of the planning component(s) 126 as closely as possible and within the capabilities of the vehicle 102. In some examples, the remote operator may determine the trajectory or path, and may thus take the place of or augment the behavior selector. In such examples, the remote operator may provide controls that may be received by the control component(s) 128, and the control component(s) 128 may follow the controls directly, may follow the controls as closely as possible within the capabilities of the vehicle, or may take the controls as a suggestion and determine, using one or more layers of the drive stack 108, whether the controls should be executed or whether other controls should be executed.

The control component(s) 128 may use tight feedback to handle unplanned events or behaviors that are not modeled and/or anything that causes discrepancies from the ideal (e.g., unexpected delay). In some examples, the control component(s) 128 may use a forward prediction model that takes control as an input variable, and produces predictions that may be compared with the desired state (e.g., compared with the desired lateral and longitudinal path requested by the planning component(s) 126). The control(s) that minimize discrepancy may be determined.
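
As a rough illustration of selecting the control that minimizes discrepancy, the sketch below evaluates a set of candidate controls with a forward prediction model and keeps the candidate whose predicted state is closest to the desired state. The prediction model, the candidate grid, and the state representation are simplified assumptions for illustration, not the patent's actual controller.

```python
import math

def predict_state(state, control, dt=0.1):
    """Very simplified forward model: state = (x, y, heading, speed),
    control = (steering_rate, acceleration). An assumption for illustration."""
    x, y, heading, speed = state
    steering_rate, accel = control
    heading += steering_rate * dt
    speed += accel * dt
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return (x, y, heading, speed)

def select_control(state, desired_state, candidate_controls):
    """Return the candidate control whose predicted state is closest to the desired state."""
    def discrepancy(predicted):
        return sum((p - d) ** 2 for p, d in zip(predicted, desired_state))
    return min(candidate_controls, key=lambda c: discrepancy(predict_state(state, c)))

# Example usage with a small candidate grid of (steering_rate, acceleration) pairs.
candidates = [(s, a) for s in (-0.2, 0.0, 0.2) for a in (-1.0, 0.0, 1.0)]
best = select_control((0.0, 0.0, 0.0, 10.0), (1.0, 0.1, 0.02, 10.5), candidates)
print(best)
```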

Although the planning component(s) 126 and the control component(s) 128 are illustrated separately, this is not intended to be limiting. For example, in some embodiments, the delineation between the planning component(s) 126 and the control component(s) 128 may not be precisely defined. As such, at least some of the components, features, and/or functionality attributed to the planning component(s) 126 may be associated with the control component(s) 128, and vice versa.

The obstacle avoidance component(s) 130 may aid the autonomous vehicle 102 in avoiding collisions with objects (e.g., moving and stationary objects). The obstacle avoidance component(s) 130 may include a computational mechanism at a “primal level” of obstacle avoidance that may act as a “survival brain” or “reptile brain” for the vehicle 102. In some examples, the obstacle avoidance component(s) 130 may be used independently of components, features, and/or functionality of the vehicle 102 that is required to obey traffic rules and drive courteously. In such examples, the obstacle avoidance component(s) may ignore traffic laws, rules of the road, and courteous driving norms in order to ensure that collisions do not occur between the vehicle 102 and any objects. As such, the obstacle avoidance layer may be a separate layer from the rules of the road layer, and the obstacle avoidance layer may ensure that the vehicle 102 is only performing safe actions from an obstacle avoidance standpoint. The rules of the road layer, on the other hand, may ensure that the vehicle 102 obeys traffic laws and conventions, and observes lawful and conventional right of way (as described herein).

In some examples, when controls are received from the remote control system 106, the obstacle avoidance component(s) 130 may analyze the controls to determine whether implementing the controls would cause a collision or otherwise not result in a safe or permitted outcome. In such an example, when it is determined that the controls may not be safe, or may result in a collision, the controls may be aborted or discarded, and the vehicle 102 may implement a safety procedure to get the vehicle 102 to a safe operating condition. The safety procedure may include coming to a complete stop, pulling to the side of the road, slowing down until a collision is no longer likely or imminent, and/or another safety procedure. In some examples, when controls from the remote control system 106 are determined to be unsafe, control by the remote control system 106 may be transferred, at least temporarily, back to the vehicle 102.
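
A minimal sketch of this gating logic is shown below, assuming a collision_checker callable provided by the obstacle avoidance layer; the function and enum names are illustrative, not taken from the source.

```python
from enum import Enum

class SafetyAction(Enum):
    EXECUTE = "execute"
    STOP = "come_to_stop"
    PULL_OVER = "pull_over"

def gate_remote_controls(controls, collision_checker, can_pull_over):
    """Accept remote controls only if the collision check passes; otherwise
    discard them and fall back to a safety procedure."""
    if collision_checker(controls):
        return SafetyAction.EXECUTE, controls
    # Controls deemed unsafe: discard them and pick a safety procedure.
    fallback = SafetyAction.PULL_OVER if can_pull_over else SafetyAction.STOP
    return fallback, None

# Example: an always-failing check forces a fallback to pulling over.
action, safe_controls = gate_remote_controls(
    {"steering_deg": 5.0}, collision_checker=lambda c: False, can_pull_over=True)
print(action)  # SafetyAction.PULL_OVER
```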

In some examples, such as the example in FIG. 1B, the obstacle avoidance component(s) 130 may be located after the control component(s) 128 in the drive stack 108 (e.g., in order to receive desired controls from the control component(s) 128, and test the controls for obstacle avoidance). However, even though the obstacle avoidance component(s) 130 are shown stacked on top of (e.g., with respect to an autonomous driving software stack) the planning component(s) 126 and the control component(s) 128, this is not intended to be limiting. For example, the obstacle avoidance component(s) 130 may be additionally or alternatively implemented prior to either of the planning component(s) 126 or the control component(s) 128, prior to the control component(s) 128 but after the planning component(s) 126, as part of or integral to the planning component(s) 126 and/or the control component(s) 128, as part of one or more of the perception component(s) 122, and/or at a different part of the drive stack 108 depending on the embodiment. As such, the obstacle avoidance component(s) 130 may be implemented in one or more locations within an autonomous vehicle driving stack or architecture without departing from the scope of the present disclosure.

In some examples, as described herein, the obstacle avoidance component(s) 130 may be implemented as a separate, discrete feature of the vehicle 102. For example, the obstacle avoidance component(s) 130 may operate separately from (e.g., in parallel with, prior to, and/or after) the planning layer, the control layer, the actuation layer, and/or other layers of the drive stack 108.

The encoder 134 may encode the sensor data from the sensor manager 120 and/or the sensor(s) 110 of the vehicle 102. For example, the encoder 134 may be used to convert the sensor data from a first format to a second format, such as a compressed, downsampled, and/or lower data size format than the first format. In such an example, the first format may be a raw format, a lossless format, and/or another format that includes more data (e.g., for image data, the first format may include a raw image format that may include enough data to fully represent each frame of video). The second format may be a format that includes less data, such as a lossy format and/or a compressed format (e.g., for image data, the second format may be H.264, H.265, MPEG-4, MP4, Advanced Video Coding High Definition (AVCHD), Audio Video Interleave (AVI), Windows Media Video (WMV), etc.). The sensor data may be compressed to a smaller data size in order to ensure efficient and effective transmission of the sensor data over the network(s) 104 (e.g., cellular networks, such as 5G).
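
The conversion from a raw frame to a smaller transmission format can be pictured with a simple sketch. The example below downsamples and JPEG-compresses a single frame with Pillow as a stand-in for a real H.264/H.265 video encoder (which would typically run in hardware on the vehicle); the function and parameter names are illustrative assumptions.

```python
import io
from PIL import Image

def encode_frame(raw_rgb_bytes, width, height, scale=0.5, quality=70):
    """Convert a raw RGB frame (first format) to a compressed JPEG byte
    string (second format) suitable for transmission over a constrained
    network. A real system would use a hardware video codec instead."""
    frame = Image.frombytes("RGB", (width, height), raw_rgb_bytes)
    # Downsample to reduce resolution, then compress lossily.
    frame = frame.resize((int(width * scale), int(height * scale)))
    buffer = io.BytesIO()
    frame.save(buffer, format="JPEG", quality=quality)
    return buffer.getvalue()
```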

Once the sensor data is encoded by the encoder 134, a communication component 136 of the vehicle 102 may transmit or send the encoded sensor data to the remote control system 106. Although the sensor data is described as being transmitted as encoded sensor data, this is not intended to be limiting. In some examples, there may not be an encoder 134, and/or at least some of the sensor data may be transmitted in an uncompressed or non-encoded format.

The remote control system 106 may receive the sensor data at communication component 140 of the remote control system 106. Where a communication is received and/or transmitted as a network communication, the communication component 136 and/or 140 may comprise a network interface which may use one or more wireless antenna(s) and/or modem(s) to communicate over one or more networks. By including one or more modems and/or one or more wireless antennas, the vehicle 102 may be capable of communication across different network types (e.g., Wi-Fi, cellular 4G, LTE, 5G, etc.), and may also have redundancy for when one or more networks may not be available, when one or more networks may not have a strong enough connection to transmit the sensor data, and/or for when one or more of the modems goes offline or stops working. For example, the network interface may be capable of communication over Long-Term Evolution (LTE), Wideband Code-Division Multiple Access (WCDMA), Universal Mobile Telecommunications Service (UMTS), Global System for Mobile communications (GSM), CDMA2000, etc. The network interface may also enable communication between objects in the environment (e.g., vehicles, mobile devices, etc.), using local area network(s), such as Bluetooth, Bluetooth Low Energy (LE), Z-Wave, ZigBee, etc., and/or Low Power Wide-Area Network(s) (LPWANs), such as Long Range Wide-Area Network (LoRaWAN), SigFox, etc.

In some examples, such as where the network strength is below a threshold, or a certain network type is not available for connection (e.g., only a 4G cellular connection is available, and 5G is preferable), only required or necessary sensor data may be transmitted to the remote control system 106 (or required or necessary sensor data may be prioritized in fitting the sensor data into network constraints). For example, during standard or normal operation, all of the sensor data may be transmitted to the remote control system 106 (e.g., sensor data from each of the sensors 110 that generate sensor data for use by the remote control system 106). However, once the network signal drops below a threshold signal strength, or once a certain network type becomes unavailable, less sensor data, such as sensor data from a subset of the sensors 110, may be transmitted.

In such examples, orientation data representative of an orientation of the VR headset 116 of the remote control system 106 may be used. For example, if the remote operator is looking toward the left-front of the virtual vehicle within the virtual environment, the sensor data from the sensor(s) 110 that have a field(s) of view of the left-front of the vehicle 102 may be determined. These sensor(s) 110 may be a left-facing camera(s), a forward-facing camera(s), a LIDAR sensor and/or RADAR sensor(s) with a field(s) of view to the left and/or front of the vehicle 102, and/or other sensor types. The orientation data may be used to inform the vehicle 102 (e.g., via one or more signals) of a subset of the sensor data that should be transmitted to the remote control system 106. As a result (e.g., based on the signal(s)), the subset of the sensor data may be encoded and transmitted across the network(s) 104 to the remote control system 106. As the remote operator continues to look around the virtual environment, updated orientation data may be generated and transmitted over the network(s) 104 to the vehicle 102, and updated subsets of the sensor data may be received by the remote control system 106. As a result, the remote operator may be presented with a field of view that includes information relevant to where the remote operator is looking, and the other portions of the virtual environment may not be streamed or rendered.
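
One simple way to picture this selection is to compare the headset yaw against the field of view of each sensor and transmit only the sensors that cover the viewing direction. The sketch below is a minimal illustration; the sensor registry, margins, and function names are assumptions for illustration only.

```python
# Illustrative sensor registry: yaw of each sensor's optical axis (degrees,
# 0 = straight ahead, positive = counterclockwise) and its horizontal field of view.
SENSORS = {
    "front_camera": {"yaw": 0.0, "fov": 120.0},
    "left_camera": {"yaw": 90.0, "fov": 120.0},
    "right_camera": {"yaw": -90.0, "fov": 120.0},
    "rear_camera": {"yaw": 180.0, "fov": 120.0},
}

def sensors_for_headset_yaw(headset_yaw_deg, margin_deg=15.0):
    """Return the subset of sensors whose field of view covers the direction
    the remote operator is currently looking."""
    selected = []
    for name, spec in SENSORS.items():
        # Smallest angular difference between headset yaw and sensor axis.
        diff = (headset_yaw_deg - spec["yaw"] + 180.0) % 360.0 - 180.0
        if abs(diff) <= spec["fov"] / 2.0 + margin_deg:
            selected.append(name)
    return selected

print(sensors_for_headset_yaw(45.0))  # looking toward the left-front
```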

In some examples, a subset of the sensor data may be transmitted to the remote control system 106 that enables the virtual environment 156 to be rendered without providing any image data (e.g., images or video of the real-world or physical environment). For example, locations of objects, surfaces, and/or structures, as well as types of objects, surfaces, and/or structures may be determined from the sensor data, and this information may be transmitted to the remote control system 106 for generating a completely synthetic virtual environment (e.g., no images or video of the real or physical world, just a virtual world). In such an example, if it is determined a vehicle is to the left of the vehicle 102, and a person is to the right, the virtual environment may be rendered to include a vehicle and a person (e.g., generic representations) at locations that correspond to the real-world. In a more detailed example, the vehicle type of the vehicle may be determined, and the virtual environment may include a virtual representation of the vehicle type (e.g., as determined from a data store).

In other examples, a combination of a fully rendered virtual environment and image data (e.g., images or video) may be used within the virtual environment. For example, images or video may be included within the virtual environment in a field of view of the remote operator, but other portions of the virtual environment may include only virtual representations. As a result, if a remote operator changes orientation, and image data has not yet been received for the updated field of view of the remote operator, there may still be enough information within the environment (e.g., the virtual representations of the objects, surfaces, and/or structures) based on the rendering to allow the remote operator to control the vehicle 102 safely.

Although the signal strength or connection type is described as a reason for transmitting only a subset of the sensor data, this is not intended to be limiting. For example, the subset of the sensor data may be transmitted at all times, regardless of network connection strength and/or type, in order to reduce bandwidth or preserve network resources.

In some examples, once received by the remote control system 106, the sensor data (e.g., encoded sensor data) may be decoded by decoder 142 of the remote control system 106. In other examples, the encoded sensor data may be used by the virtual environment generator 114 and/or the remote control(s) 118 (e.g., for calibration) without decoding. The virtual environment generator 114 may use the sensor data to generate the virtual environment. The sensor data may include image data from camera(s), LIDAR data from LIDAR sensor(s), RADAR data from RADAR sensor(s), and/or other data types from other sensor(s) 110, such as vehicle state data and/or configuration data, as described herein. The virtual environment generator 114 may use the sensor data to generate or render the virtual environment, and at least a portion of the virtual environment may be displayed on a display of the VR headset 116. Examples of the virtual environment are described in more detail herein, such as with reference to FIGS. 2A-2B.

In some examples, the virtual environment may be generated using the vehicle state data and/or the calibration data, in addition to image data, LIDAR data, SONAR data, etc. In such examples, the vehicle state data may be used to update a location and/or orientation of the virtual vehicle in the virtual environment and/or to update visual indicators of the vehicle state in the virtual environment (e.g., to update a speedometer, a revolutions per minute (RPM) display, a fuel level display, a current time where the vehicle 102 is located, an odometer, a tachometer, a coolant temperature gauge, a battery charge indicator, a gearshift indicator, a turn signal indicator, a headlight/high beam indicator, a malfunction/maintenance indicator, etc.). As a further example, the vehicle state data may be used to apply one or more rendering effects to the virtual environment, such as motion blur that is based at least in part on the velocity and/or acceleration of the vehicle 102.

In some examples, state data may be determined by the vehicle 102 for the objects and surfaces in the environment, and this state information may be used to generate the virtual environment (e.g., to provide visual indicators of types of objects, such as persons, vehicles, animals, inanimate objects, etc., or surfaces, such as a paved road, a gravel road, an uneven road, an even road, a driveway, a one-way street, a two-way street, etc., to provide visual indicators about objects, such as speeds of objects, directions of objects, etc., and/or other information pertaining to the environment).

The calibration data may be used to update the virtual controls (e.g., the representation of the remote control(s) 118 in the virtual environment). For some non-limiting examples: if the steering wheel is turned to the left, the virtual steering wheel may be rendered as turned to the left; if the wheels are turned to the right, the virtual wheels may be rendered to be turned to the right; if the windows are down, the virtual windows may be rendered to be down; if the seats are in a certain position, the virtual seats may be rendered to be in the certain positions; and if the instrument panel and/or HMI display is on, at a certain light level, and/or showing certain data, the virtual instrument panel and/or HMI display may be on, at the certain light level, and/or showing the certain data in the virtual environment.

Any other examples for updating the virtual environment to reflect the vehicle 102 and/or other aspects of the real-world environment are contemplated within the scope of the present disclosure. By updating at least a portion of the virtual vehicle and/or other features of the virtual environment using the calibration data, the remote operator may have a more immersive, true-to-life, and realistic virtual environment to control the virtual vehicle within, thereby contributing to the ability of the remote operator to control the vehicle 102 in the real-world environment more safely and effectively.

At least some of the sensor data may be used by the remote control(s) 118, such as the calibration data for calibrating the remote control(s) 118. For example, similar to what is described herein with respect to updating the virtual environment using the calibration data, the remote control(s) 118 may be calibrated using the calibration data. In some examples, a steering component (e.g., a steering wheel, a joystick, etc.) of the remote control(s) 118 may be calibrated to an initial position that corresponds to the position of steering component 112A of the vehicle 102 at the time of transfer of the control to the remote control system 106. In another example, the steering component sensitivity may be calibrated using the calibration data, such that inputs to the steering component of the remote control(s) 118 (e.g., turning the steering wheel x number of degrees to the left) substantially correspond to the inputs to the steering component 112A of the vehicle 102 (e.g., the resulting actuation of the vehicle 102 may correspond to turning the steering wheel of the vehicle 102 x number of degrees to the left). Similar examples may be implemented for the acceleration component and/or the braking component of the remote control(s) to correspond to the sensitivity, degree of movement, pedal stiffness, and/or other characteristics of acceleration component 112C and braking component 112B, respectively, of the vehicle 102. In some examples, any of these various calibrations may be based at least in part on the year, make, model, type, and/or other information of the vehicle 102 (e.g., if the vehicle 102 is a Year N, Make X, Model Y, the virtual vehicle may retrieve associated calibration settings from a data store).

In some examples, the calibration data may be used to calibrate the remote control(s) 118 such that the remote control(s) are scaled to the vehicle 102 (or object, such as a robot), such as where the vehicle is larger, smaller, or of a different type than the virtual vehicle. For example, the vehicle 102 or object may be a small vehicle or object (e.g., that cannot fit passengers), such as a model car or an exploratory vehicle (e.g., for navigating into tight or constrained environments, such as tunnels, beneath structures, etc.), etc., or may be a larger object, such as a bus, a truck, etc. In such examples, calibration data may be used to scale the remote control(s) 118 to that of the smaller, larger, or different type of object or vehicle. For example, providing an input to the steering component of the remote control(s) 118, such as by turning a steering wheel 10 degrees, may be scaled for a smaller vehicle to 2 degrees, or for a larger vehicle, to 20 degrees. As another example, the braking component of the remote control(s) 118 may correspond to anti-skid braking control inputs, but the vehicle 102 or object, especially when small, may use skid braking. In such examples, the remote control(s) may be calibrated such that inputs to the braking component of the remote control(s) are adjusted for skid braking.
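
A minimal sketch of such scaling is shown below, assuming per-vehicle scaling factors retrieved from a data store; the profile fields, factors, and function name are illustrative assumptions rather than details from the source.

```python
def scale_remote_input(steering_deg, accel_pct, brake_pct, profile):
    """Map raw remote-control inputs to inputs appropriate for the target
    vehicle or object, using an assumed, illustrative calibration profile."""
    return {
        "steering_deg": steering_deg * profile["steering_scale"],
        "accel_pct": min(accel_pct * profile["accel_scale"], 100.0),
        "brake_pct": min(brake_pct * profile["brake_scale"], 100.0),
    }

# Example: a 10 degree steering-wheel input scaled down for a small exploratory vehicle.
small_vehicle_profile = {"steering_scale": 0.2, "accel_scale": 0.5, "brake_scale": 0.5}
print(scale_remote_input(10.0, 40.0, 0.0, small_vehicle_profile))
# {'steering_deg': 2.0, 'accel_pct': 20.0, 'brake_pct': 0.0}
```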

The scaling may additionally, or alternatively, be performed on the outputs of the remote control(s) (e.g., the control data). For example, after the control inputs are provided to the remote control(s) 118, the control inputs may be scaled to correspond to the control(s) of the smaller, larger, or different type of vehicle 102 or object. This may allow the remote operator to control the virtual vehicle or object using the remote control(s) 118 in a way that feels more natural to the remote operator, while calibrating or scaling the control data representative of the control inputs for the vehicle 102 or other object to correspond to the vehicle control data that is useable for the vehicle 102 or other object. In some examples, this may be performed by the encoder 144 of the remote control system 106, and/or by another component.

In any example, prior to transmission of the control data to the vehicle 102, the control data may be encoded by the encoder 144. The encoded control data may be in a format that is useable by the vehicle (e.g., the control data from the remote control(s) 118 may be encoded to generate vehicle control data that is useable by the vehicle 102). In other examples, the control data may be transmitted to the vehicle 102 over the network(s) 104 using the communication components 140 and 136, and the vehicle 102 may encode the control data to generate the vehicle control data. As such, the control data from the remote control(s) 118 may be converted to the vehicle control data prior to transmission by the remote control system 106, after receipt by the vehicle 102, or a combination thereof.

The control data, in some examples, may be received by the communication component 136 of the vehicle 102 and decoded by the decoder 138. The vehicle control data may then be used by at least one of the layers of the drive stack 108 or may bypass the drive stack 108 (e.g., where full control is transferred to the remote control system 106 and the vehicle 102 exits self-driving or autonomous mode completely) and be passed directly to the control components of the vehicle 102, such as the steering component 112A, the braking component 112B, the acceleration component 112C, and/or other components (e.g., a blinker, light switches, seat actuators, etc.). As such, the amount of control given to the remote control system 106 may range from no control, to partial control, to full control. The amount of control of the autonomous vehicle 102 may inversely correspond to the amount of control given to the remote control system 106. Thus, when the remote control system 106 has full control, the autonomous vehicle 102 may not execute any on-board control, and when the remote control system 106 has no control, the autonomous vehicle 102 may execute all on-board control.

In examples where the vehicle control data (e.g., corresponding to the control data generated based on control inputs to the remote control(s) 118) is used by the drive stack 108, there may be different levels of use. In some examples, only the obstacle avoidance component(s) 130 may be employed. In such examples, the vehicle control data may be analyzed by the obstacle avoidance component(s) 130 to determine whether implementing the controls corresponding to the vehicle control data would result in a collision or an otherwise unsafe or undesirable outcome. When a collision or unsafe outcome is determined, the vehicle 102 may implement other controls (e.g., controls that may be similar to the controls corresponding to the vehicle control data but that decrease, reduce, or remove altogether the risk of collision or other unsafe outcome). In the alternative, the vehicle 102 may implement a safety procedure when a collision or other unsafe outcome is determined, such as by coming to a complete stop. In these examples, the control inputs from the remote control(s) 118 may be associated (e.g., one-to-one) with the controls of the vehicle 102 (e.g., the control inputs to the remote control(s) 118 may not be suggestions for control of the vehicle, such as waypoints, but rather may correspond to controls that should be executed by the vehicle 102).

As described herein, the control inputs from the remote control(s) 118 may not be direct or one-to-one controls for the vehicle 102, in some examples. For example, the control inputs to the remote control(s) 118 may be suggestions. One form of suggestion may be an actual input to a steering component, an acceleration component, a braking component, or another component of the remote control(s) 118. In such an example, the vehicle control data corresponding to these control inputs to the remote control(s) 118 may be used by the drive stack 108 to determine how much, or to what degree, to implement the controls. For example, if the remote operator provides an input to a steering component of the remote control(s) 118 (e.g., to turn a steering wheel 10 degrees), the planning component(s) 126 and/or the control component(s) 128 of the drive stack 108 may receive the vehicle control data representative of the input to the steering component, and determine to what degree to turn to the left (or to not turn left at all). The drive stack 108 may make a determination to turn left, for example, but may determine that a more gradual turn is safer, follows the road shape or lane markings more accurately, and/or otherwise is preferable over the rate of the turn provided by the remote operator (e.g., the 10 degree turn of the steering wheel). As such, the vehicle control data may be updated and/or new vehicle control data may be generated by the drive stack 108, and executed by the steering component 112A of the vehicle 102 (e.g., based at least in part on a command or signal from the actuation component(s) 132).
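
One way to picture "suggestion" handling is a simple arbitration rule that limits and blends the operator's steering input with the drive stack's own preferred angle. The blending rule and limits in the sketch below are illustrative assumptions, not the patent's algorithm.

```python
def arbitrate_steering(suggested_deg, lane_following_deg, max_rate_deg=5.0):
    """Treat the remote operator's steering input as a suggestion: move toward
    it, but only as far as a drive-stack-imposed rate limit and the lane
    geometry allow."""
    # Clamp the suggested turn to the allowed per-step rate of change.
    limited = max(-max_rate_deg, min(max_rate_deg, suggested_deg))
    # Blend with the drive stack's own lane-following angle.
    return 0.5 * limited + 0.5 * lane_following_deg

# A 10 degree suggestion may be softened into a more gradual turn.
print(arbitrate_steering(suggested_deg=10.0, lane_following_deg=2.0))  # 3.5
```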

Similar use of the vehicle control data may be performed based at least in part on inputs to the acceleration component, braking component, and/or other components of the remote control(s) 118. For example, an input to an acceleration component of the remote control(s) 118 may cause an acceleration by the acceleration component 112C of the vehicle 102, but the acceleration rate may be less, more, or zero, depending on the determination(s) by the drive stack 108. As another example, an input to a braking component of the remote control(s) 118 may cause a braking by the braking component 112B of the vehicle 102, but the deceleration rate may be less, more, or zero, depending on the determination(s) by the drive stack 108.

Another form of suggestion from the remote control(s) 118 may be waypoint suggestions. For example, the remote operator may use a remote control 118 that is a pointer (e.g., a virtual laser pointer), and may point to virtual locations in the virtual environment that the virtual vehicle is to navigate to (e.g., a virtual waypoint). The real-world locations in the real-world environment that correspond to the virtual locations in the virtual environment may be determined, and the vehicle control data may represent the real-world locations (e.g., the real-world waypoints). As such, the drive stack 108, such as the planning component(s) 126 and/or the control component(s) 128, may use the real-world waypoint to determine a path and/or control(s) for following the path to reach the real-world waypoint. The actuation component(s) 132 may then cause the steering component 112A, the braking component 112B, the acceleration component 112C, and/or other components of the vehicle 102 to control the vehicle 102 to travel to the real-world location corresponding to the real-world waypoint. The remote operator may continue to provide these control inputs to navigate the vehicle 102 through the situation, scenario, and/or environment that necessitated the transfer of at least partial control to the remote control system 106.

Now referring to FIGS. 2A-2B, FIGS. 2A-2B illustrate non-limiting examples of virtual environments that may be generated by the virtual environment generator 114. In one or more embodiments, the virtual environments may be displayed on a display of the VR headset 116. Alternatively, the virtual environments may be displayed on a display corresponding to a physical representation of a vehicle. The physical representation may include any configuration of control (e.g., a steering wheel, one or more accelerators or brakes, one or more transmission controls), seating, or visibility (e.g., one or more displays positioned as mirrors) features corresponding to physical, real-world counterparts in an ego-vehicle. Virtual environment 200 of FIG. 2A may include a virtual environment where an exterior of a vehicle 202 is rendered, such that a field of view of the remote operator includes the exterior of the vehicle 202. In one or more embodiments, the vehicle 202 may be presented as a virtually simulated vehicle. Alternatively, the virtual environment 200 may be rendered in one or more displays positioned around a partially or completely physical vehicle 202 calibrated to correspond to the ego-vehicle. In the cases where the vehicle 202 comprises a virtual vehicle, the virtual vehicle 202 may be rendered on a surface 204 of the virtual environment 200. In this case, the surface 204 may be one of any number of suitable surfaces, such as a representation of a garage floor, a laboratory floor, etc. However, this is not intended to be limiting, and in some examples, the surface 204 may be rendered to represent the surface the vehicle 102 is on in the real-world environment (e.g., using sensor data generated from cameras with a field(s) of view of the surface around the vehicle 102, such as a parking camera(s)).

The sensor data, such as image data, representative of a field(s) of view of the sensor(s) 110 may be displayed within the virtual environment 200 on one or more virtual displays 206, such as the virtual displays 206A, 206B, 206C, and/or additional or alternative virtual displays 206. In some examples, the virtual display(s) 206 may be rendered to represent up to a 360 degree field of view of the sensor(s) 110 of the vehicle 102. As described herein, the surface 204 and/or an upper portion 208 of the virtual environment 200 may also be rendered to represent the real-world environment of the vehicle 102. The upper portion 208 may include buildings, trees, the sky, and/or other features of the real-world environment, such that the virtual environment 200 may represent a fully immersive environment. The surface 204 and/or the upper portion 208, similar to the virtual display(s) 206, may include images or video from image data generated by the vehicle 102, may include rendered representations of the environment as gleaned from the sensor data (e.g., image data, LIDAR data, RADAR data, etc.), or a combination thereof.

The instance of the virtual environment 200 illustrated in FIG. 2A may represent the scenario represented in the image 146. For example, the virtual display 206B may include the virtual representations of the van 152, the boxes 154, the street 148, and/or other features of the image 146. The virtual representations of the image data may include the images or video from the image data, rendered within the virtual environment 200. As such, the images or video displayed on the virtual displays 206 may be the actual images or video (e.g., not a virtual representation thereof). In other examples, the images or video displayed on the virtual displays 206 may be a rendered representation of the environment, which may be generated from the sensor data (e.g., the image data, the LIDAR data, the SONAR data, etc.).

As described herein, the vehicle state data and/or the calibration data may be used to generate the virtual environment. In such examples, wheels 210 of the virtual vehicle 202 may be rendered at approximately the wheel angle of the wheels of the vehicle 102 in the real-world environment. In this illustration, the wheels may be straight. Similarly, lights may be turned on or off, including brake lights when braking, emergency lights when turned on, etc. When the vehicle 202 includes a physical, tangible representation, the vehicle state data and/or the calibration data of the ego-vehicle may be used to calibrate and orient the physical representation of the vehicle 202.

When controlling the vehicle 202 implemented as a virtual vehicle in the virtual environment 200, or other virtual environments where the vantage point of the remote operator is outside of the virtual vehicle 202, the remote operator may be able to move around the virtual environment 200 freely to control the virtual vehicle 202 from different vantage points (or may be able to change the vantage point to inside the virtual vehicle, as illustrated in FIG. 2B). For example, the remote operator may be able to sit on top of or above the virtual vehicle 202, to the side of the virtual vehicle 202, in front of the virtual vehicle 202, behind the virtual vehicle 202, etc.

In examples where the remote operator provides virtual waypoints rather than actual controls, a vantage point outside of the virtual vehicle 202 may be more useful. For example, the remote operator may have a vantage point from on top of the virtual vehicle 202, such as at location 212 within the virtual environment 200, and may use device 214 (e.g., a virtual pointer, a virtual laser, etc.) to identify a location within the virtual environment 200 and/or a location within the image data represented within the virtual environment 200, such as location 216. When the location 216 corresponds to the image data, such as a point(s) or pixel(s) within the image data, the real-world coordinates corresponding to the point(s) or the pixel(s) may be determined (e.g., by the vehicle 102 and/or the remote control system 106). For example, the camera(s) that captured the image data may be calibrated such that transformations from two-dimensional locations of the point(s) or the pixel(s) within the image data to three-dimensional points in the real-world environment may be computed or known. As a result, the virtual waypoints (e.g., the location 216) identified within the virtual environment 200 by the remote operator may be used to determine real-world locations (e.g., corresponding to the location 216) for the vehicle 102 to navigate to. As described herein, the vehicle 102 may use this information to determine the path, controls, and/or actuations that will control the vehicle 102 to the real-world location.
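
A common way to compute such a pixel-to-world transformation is to cast a ray through the selected pixel using the camera calibration and intersect it with the ground plane. The sketch below assumes a pinhole camera model and a locally flat road; the intrinsics, rotation, camera height, and function names are illustrative assumptions, not values from the source.

```python
import numpy as np

def pixel_to_ground_point(u, v, K, R, t):
    """Project an image pixel (u, v) onto the flat ground plane (z = 0) in the
    vehicle frame, using camera intrinsics K, rotation R (camera-to-vehicle),
    and camera position t in the vehicle frame."""
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # ray in camera coordinates
    ray_veh = R @ ray_cam                                 # ray in vehicle coordinates
    if abs(ray_veh[2]) < 1e-9:
        return None                                       # ray is parallel to the ground
    s = -t[2] / ray_veh[2]                                # scale so the ray hits z = 0
    if s <= 0:
        return None                                       # intersection is behind the camera
    return t + s * ray_veh

# Example: forward-facing camera 1.5 m above the ground, pitched 10 degrees down.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
pitch = np.deg2rad(10.0)
# Camera axes: x right, y down, z forward; vehicle axes: x forward, y left, z up.
R = np.array([[0.0, -np.sin(pitch), np.cos(pitch)],
              [-1.0, 0.0, 0.0],
              [0.0, -np.cos(pitch), -np.sin(pitch)]])
t = np.array([0.0, 0.0, 1.5])
print(pixel_to_ground_point(700.0, 500.0, K, R, t))  # approx. [4.6, -0.3, 0.0]
```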

As the vehicle 102 is controlled through the real-world environment, the virtual display(s) 206 may be updated to reflect the updated sensor data over time (e.g., at the frame rate that the sensor data is captured, such as 30 frames per second (“fps”), 60 fps, etc.). As the (virtual) vehicle 202 is being controlled, the wheels, lights, windows, blinkers, etc., may be updated according to the corresponding features on the vehicle 102 in the real-world environment.

Now referring to FIG. 2B, the virtual environment 156 may be the same virtual environment 156 of FIG. 1A, described herein. Although the vantage point illustrated in FIG. 2B is from a left-side driver’s seat within the virtual vehicle, this is not intended to be limiting. For example, and without departing from the scope of the present disclosure, the remote operator may have a vantage point from the position of a right-side driver’s seat (e.g., for jurisdictions where driving is on the left side of the road), a passenger’s seat, a back seat, an imaginary seat (e.g., a middle-driver’s seat), or from a vantage point within the virtual vehicle not corresponding to a seat, such as from anywhere within the virtual vehicle.

As described herein, one or more of the features of the virtual vehicle may be made at least partially transparent and/or may be removed from the rendering of the virtual vehicle. For example, certain portions of a real-world vehicle (alternatively referred to herein as “ego-vehicle” or “physical vehicle”) may be used for structural support, but may cause occlusions for a driver (e.g., “blind spots”). In a virtual vehicle, this need for structural support is non-existent, so portions of the virtual vehicle that may be visually occluding may be removed and/or made at least partially transparent in the virtual environment 156. For example, the support column 166, and/or other support columns of the virtual vehicle, may be made transparent (as illustrated in FIG. 2B) or may be removed completely from the rendering. In other examples, doors 222 may be made transparent (e.g., but for an outline) or entirely removed. As a result, the remote operator may be presented with a field(s) of view that is more immersive, with fewer occlusions, thereby facilitating more informed, safer control.

In addition, a portion(s) of the virtual vehicle may be made at least partially transparent or be removed even where the portion(s) of the virtual vehicle does not cause occlusions, in order to allow the remote operator to visualize information about the virtual vehicle (and thus the vehicle 102) that would not be possible in a real-world environment. For example, a portion of the virtual vehicle between a vantage point of the remote operator and one or more of the wheels and/or tires of the vehicle may be made at least partially transparent or may be removed from the rendering, such that the remote operator is able to visualize an angle of the wheel(s) and/or the tire(s) (e.g., where the wheels and/or tires are at the angle based on the calibration data).

The virtual environment 156 may include, in addition to or alternatively from the features described herein with respect to FIG. 1A, a virtual instrument panel 218, virtual side-view or wing-mirrors 220, and/or other features. The virtual instrument panel 218 may display any number of different types of information, such as, without limitation, a speedometer, a fuel level indicator, an oil pressure indicator, a tachometer, an odometer, turn indicators, gearshift position indicators, seat belt warning light(s), parking-brake warning light(s), engine-malfunction light(s), airbag (SRS) system information, lighting controls, safety system controls, navigation information, etc. The virtual side-view or wing-mirrors 220 may display sensor data captured by one or more sensor(s) 110 (e.g., camera(s)) of the vehicle 102 with a field(s) of view to the rear and/or to the side of the vehicle 102 (e.g., to represent a side-view or wing-mirror of the vehicle 102).

Now referring to FIGS. 3A-3B, each block of methods 300A and 300B, described herein, comprises a computing process that may be performed using any combination of hardware, firmware, and/or software. For instance, various functions may be carried out by a processor executing instructions stored in memory. The methods may also be embodied as computer-usable instructions stored on computer storage media. The methods may be provided by a standalone application, a service or hosted service (standalone or in combination with another hosted service), or a plug-in to another product, to name a few. In addition, methods 300A and 300B are described, by way of example, with respect to the autonomous vehicle control system 100 of FIGS. 1A-1B. However, these methods may additionally or alternatively be executed by any one system, or any combination of systems, including, but not limited to, those described herein.

FIG. 3A is a flow diagram showing a method 300A of remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure. The method 300A, at block B302, includes determining to transfer at least partial control of a vehicle to a remote control system. For example, the vehicle 102 (e.g., one or more components of the drive stack 108), the remote operator, a passenger, and/or another actor may determine to transfer at least partial control to the remote control system 106. In such examples, the determination may be to activate, initiate, or otherwise begin a remote control session. In examples where the vehicle 102 determines to transfer control, the determination, as described herein, may be based on a constraint on the vehicle 102, such as rules of the road, an obstacle in a path of the vehicle 102, etc., that may not allow the vehicle 102 to navigate a situation, scenario, and/or environment. The determination may be made based on an analysis of sensor data of the vehicle 102, and may be made by one or more layers of the drive stack 108 in some examples.

The method 300A, at block B304, includes receiving sensor data from a sensor(s) of the vehicle. For example, sensor data from the sensor(s) 110 may be received.

The method 300A, at block B306, includes encoding the sensor data to generate encoded sensor data. For example, the sensor data may be encoded into a different format, such as a less data-intensive format. If the sensor data includes image data, for example, the image data may be converted from a first format (e.g., a raw image format) to a second format (e.g., an encoded video format, such as H.264, H.265, AV1, VP9, or another image format, including but not limited to those described herein).

The method 300A, at block B308, includes transmitting the encoded sensor data to the remote control system for display by a virtual reality headset of the remote control system. For example, the encoded sensor data may be transmitted to the remote control system 106 for display on a display of the VR headset 116.

The method 300A, at block B310, includes receiving control data representative of at least one control input to the remote control system. For example, control data representative of at least one input to the remote control(s) 118 may be received from the remote control system 106. In some examples, the control data may not be in a format useable by the vehicle 102, and thus may be converted or encoded to vehicle control data useable by the vehicle 102. In other examples, the control data may be useable by the vehicle 102, or may have already been encoded by the remote control system 106 and thus the control data received may include the vehicle control data.

The method 300A, at block B312, includes causing actuation of an actuation component(s) of the vehicle. For example, the control data (and/or the vehicle control data) may be used by the vehicle 102 to cause actuation of at least one actuation component of the vehicle 102, such as the steering component 112A, the braking component 112B, and/or the acceleration component 112C.

FIG. 3B is an example flow diagram for a method 300B of remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure. The method 300B, at block B314, includes receiving sensor data representative of a field of view in a physical environment of a sensor(s) of a vehicle. For example, sensor data representative of a field(s) of view of the sensor(s) 110 of the vehicle 102 in the real-world environment may be received.

The method 300B, at block B316, includes receiving vehicle state information of the vehicle. For example, the vehicle state information may be received from the vehicle 102.

The method 300B, at block B318, includes generating a virtual environment. For example, the virtual environment generator 114 may generate a virtual environment based on the sensor data, the vehicle state data, and/or calibration data.

The method 300B, at block B320, includes causing display of the virtual environment on a display of a remote control system. For example, the virtual environment may be displayed on a display of the VR headset 116 of the remote control system 106.

The method 300B, at block B322, includes generating control data representative of a virtual control(s) of the vehicle. For example, control data representative of control input(s) to the remote control(s) 118 for controlling a virtual vehicle may be generated.

The method 300B, at block B324, includes transmitting the control data to the vehicle. For example, the control data may be transmitted to the vehicle 102. In some examples, prior to transmission, the control data may be encoded to create vehicle control data useable by the vehicle 102.

Now referring to FIG. 4, each block of method 400, described herein, comprises a computing process that may be performed using any combination of hardware, firmware, and/or software. For instance, various functions may be carried out by a processor executing instructions stored in memory. The method may also be embodied as computer-usable instructions stored on computer storage media. The method may be provided by a standalone application, a service or hosted service (standalone or in combination with another hosted service), or a plug-in to another product, to name a few. In addition, method 400 is described, by way of example, with respect to the autonomous vehicle control system 100 of FIGS. 1A-1B. However, this method may additionally or alternatively be executed by any one system, or any combination of systems, including, but not limited to, those described herein.

FIG. 4 is an example signal flow diagram for a method 400 of remote control of an autonomous vehicle, in accordance with some embodiments of the present disclosure. The method 400, as illustrated in FIG. 4, may begin at the top of the page and end at the bottom. However, this is not intended to be limiting, and one or more of the blocks may be in an alternative order and/or may be removed, or one or more additional or alternative blocks may be used in the method 400 without departing from the scope of the present disclosure.

The method 400, at block B402, includes transferring at least partial control. For example, at least partial control may be transferred by the vehicle 102 to the remote control system 106. In such examples, a signal(s), S1, may be generated and transmitted from the autonomous vehicle 102 (e.g., via the communication component 136) to the remote control system 106 (e.g., via the communication component 140) to inform the remote control system 106 that at least partial control is being transferred. The signal(s), S1, may be representative of data indicating that control is being transferred. The transfer of control may not be executed, in some examples, as illustrated by the dashed lines. For example, where control of the vehicle 102 or object is always performed by the remote control system 106, there may not be a need to transfer control.

The method 400, at block B404, includes generating and transmitting calibration data. For example, the sensor(s) 110 of the autonomous vehicle 102 may generate calibration data, and the autonomous vehicle 102 may transmit the calibration data to the remote control system 106 (e.g., via the communication component 136 and/or 140). In such examples, a signal(s), S2, may be generated and transmitted from the autonomous vehicle 102 to the remote control system 106 that represents the calibration data. The generating and transmitting of the calibration data may not be executed, in some examples, as illustrated by the dashed lines.

The method 400, at block B406, includes calibrating remote control(s). For example, the calibration data received via the signal(s), S2, may be used by the remote control system 106 to calibrate the remote control(s) 118. The calibrating of the remote control(s) 118 may not be executed, in some examples, as illustrated by the dashed lines.

The method 400, at block B408, includes generating and transmitting sensor data and/or vehicle state data. For example, the sensor(s) 110 of the autonomous vehicle 102 may generate sensor data and/or vehicle state data and the autonomous vehicle 102 may transmit the sensor data and/or the vehicle state data to the remote control system 106 (e.g., via the communication component 136 and/or 140). In such examples, a signal(s), S3, may be generated and transmitted from the autonomous vehicle 102 to the remote control system 106 that represents the sensor data and/or the vehicle state data.

The method 400, at block B410, includes rendering a virtual environment. For example, the virtual environment generator 114 may generate and/or render the virtual environment based on the sensor data, the vehicle state data, and/or the calibration data.

The method 400, at block B412, includes displaying the virtual environment on a VR headset. For example, the virtual environment, or at least a portion thereof, may be displayed on the VR headset 116 of the remote control system 106.

The method 400, at block B414, includes receiving control input(s) to remote control(s). For example, the remote operator may provide one or more control inputs to the remote control(s) 118.

The method 400, at block B416, includes generating and transmitting control data. For example, the remote control(s) 118 of the remote control system 106 may generate control data based on the control input(s) and the remote control system 106 may transmit the control data to the autonomous vehicle 102 (e.g., via the communication component 136 and/or 140). In such examples, a signal(s), S4, may be generated and transmitted from the remote control system 106 to the autonomous vehicle 102 that represents the control data.

The method 400, at block B418, includes determining vehicle control data based on the control data. For example, the autonomous vehicle 102 may determine whether the control data is useable by the vehicle 102 and, if not, may generate vehicle control data that corresponds to the control data but that is useable by the vehicle 102.

The method 400, at block B420, includes executing control(s) based on the vehicle control data. For example, one or more controls may be executed by the vehicle 102 that may correspond to the control input(s) to the remote control system 106.

Now referring to FIG. 5A, FIG. 5A is an example data flow diagram illustrating a process 500 for training an autonomous vehicle using a machine learning model(s), in accordance with some embodiments of the present disclosure. Any number of inputs, including but not limited to sensor data 502 and/or control data representative of control input(s) to remote control(s) 118 of the remote control system 106, may be input into a machine learning model(s) 504.

The machine learning model(s) 504 may generate or compute any number of outputs, including but not limited to vehicle control data representative of vehicle control(s) 506 for controlling the vehicle 102. In some examples, the output may be control data, such as the control data generated by the remote control(s) 118 of the remote control system 106, and the control data may be, where necessary, encoded or otherwise converted to vehicle control data representative of the vehicle control(s) 506 useable by the vehicle 102. In some examples, the vehicle control(s) 506 may include vehicle trajectory information, such as a path, or points along a path, that the vehicle 102 should navigate along within the environment. The vehicle control(s) 506 may be transmitted or sent to a control component(s) 128, planning component(s) 126, and/or other layers of the drive stack 108, and the control component(s) 128, the planning component(s) 126, and/or other layers of the drive stack 108 may use the vehicle control(s) 506 to control the vehicle 102 according to the vehicle control(s) 506.

The sensor data 502 may be image data, LIDAR data, SONAR data, and/or data from one or more other sensors 110 of the vehicle 102 that may be representative of the real-world environment of the vehicle 102. In some examples, the sensor data may further include vehicle state data representative of the state of the vehicle 102, such as speed, velocity, acceleration, deceleration, orientation or pose, location or position in the environment, and/or other status information. This data may be captured by and/or received from one or more of the sensors 110 of the vehicle 102, such as one or more of the IMU sensor(s) 766, speed sensor(s) 744, steering sensor(s) 740, vibration sensor(s) 742, and/or one or more sensors of the brake sensor system 746, propulsion system 750, and/or steering system 754. The vehicle state data (e.g., speed, orientation, etc.) may be valuable to the machine learning model(s) 504 in computing the vehicle control(s) 506, as the vehicle state data may inform the machine learning model(s) 504 as to what vehicle control(s) 506 are most useful given the current vehicle state.

For example, the vehicle 102 may transfer at least partial control to the remote control system 106 as a result of encountering a situation, scenario, and/or environment that the vehicle 102 is not permitted to handle autonomously (e.g., due to one or more constraints). A remote operator may control the virtual vehicle through the virtual environment, and the control inputs by the remote operator may be represented by control data. The control data may then be encoded or converted to vehicle control data useable by the vehicle 102, and the vehicle 102 may be controlled through the situation, scenario, and/or environment based on the vehicle control data. Throughout the remote control session, the sensor(s) 110 of the vehicle 102 may generate sensor data 502. The sensor data 502 (e.g., image data and/or vehicle state data) may be input into the machine learning model(s) 504, and the machine learning model(s) 504 may learn (e.g., using ground truth control data) the vehicle control(s) 506 for navigating the situation, scenario, and/or environment, and/or similar situations, scenarios, and/or environments, such that during a next occurrence, the vehicle 102 may be able to navigate itself through the situation, scenario, and/or environment without the need for the remote control system 106.

The machine learning model(s) 504 may include any type of machine learning model(s), such as machine learning models using linear regression, logistic regression, decision trees, support vector machines (SVM), Naïve Bayes, k-nearest neighbor (kNN), K-means clustering, random forest, dimensionality reduction algorithms, gradient boosting algorithms, neural networks (e.g., auto-encoders, convolutional, recurrent, perceptrons, long/short term memory, Hopfield, Boltzmann, deep belief, deconvolutional, generative adversarial, liquid state machine, etc.), and/or other types of machine learning models.

Now referring to FIG. 5B, FIG. 5B is an example illustration of a machine learning model(s) for training an autonomous vehicle according to the process of FIG. 5A, in accordance with some embodiments of the present disclosure. The machine learning model(s) 504 of FIG. 5B may be one example of a machine learning model(s) that may be used in the process 500. However, the machine learning model(s) 504 of FIG. 5B is not intended to be limiting, and the machine learning model(s) 504 may include additional and/or different machine learning models than the machine learning model(s) 504 of FIG. 5B. The machine learning model(s) 504 of FIG. 5B may include a convolutional neural network and thus may alternatively be referred to herein as convolutional neural network 504 or convolutional network 504.

The convolutional network 504 may receive, as input, the sensor data 502 representative of one or more images generated based on image data from one or more camera(s) of the vehicle 102. In some examples, the convolutional network 504 may also receive other inputs as sensor data, such as LIDAR data, RADAR data, vehicle state data, etc. The sensor data 502 may be input into convolutional stream(s) 510 of the convolutional network 504. For example, sensor data from each sensor (e.g., where two or more sensors are used) may be input into its own convolutional stream 510.

A convolutional stream 510 may include any number of layers, such as the layers 512A-512C. One or more of the layers may include an input layer. The input layer may hold values associated with the sensor data. For example, the input layer may hold values representative of the raw pixel values of the image(s) input to the convolutional network 504 as a volume (e.g., a width, a height, and color channels (e.g., RGB), such as 32 × 32 × 3).

One or more layers may include convolutional layers. The convolutional layers may compute the output of neurons that are connected to local regions in an input layer (e.g., the input layer), each computing a dot product between their weights and a small region they are connected to in the input volume. A result of the convolutional layers may be another volume, with one of the dimensions based at least in part on the number of filters applied (e.g., the width, the height, and the number of filters, such as 32 × 32 × 12, if 12 were the number of filters).

One or more of the layers may include a rectified linear unit (ReLU) layer. The ReLU layer(s) may apply an elementwise activation function, such as max(0, x), thresholding at zero, for example. The resulting volume of a ReLU layer may be the same as the volume of the input of the ReLU layer.

One or more of the layers may include a pooling layer. The pooling layer may perform a down sampling operation along the spatial dimensions (e.g., the height and the width), which may result in a smaller volume than the input of the pooling layer (e.g., 16 × 16 × 12 from the 32 × 32 × 12 input volume).

One or more of the layers may include a fully connected layer. Each neuron in the fully connected layer(s) may be connected to each of the neurons in the previous volume. The fully connected layer may compute class scores, and the resulting volume may be 1 × 1 × number of classes. In some examples, the convolutional stream(s) 510 may include a fully connected layer, while in other examples, a fully connected layer 514 of the convolutional network 504 may be the fully connected layer for the convolutional stream(s) 510.

Although input layers, convolutional layers, pooling layers, ReLU layers, and fully connected layers are discussed herein with respect to the convolutional stream(s) 510, this is not intended to be limiting. For example, additional or alternative layers may be used in the convolutional stream(s) 510, such as normalization layers, SoftMax layers, and/or other layer types. Further, the order and number of layers of the convolutional network 504 and/or the convolutional stream 510 is not limited to any one architecture.

In addition, some of the layers may include parameters (e.g., weights), such as the convolutional layers and the fully connected layers, while others may not, such as the ReLU layers and pooling layers. In some examples, the parameters may be learned by the convolutional stream 510 and/or the fully connected layer(s) 514 during training. Further, some of the layers may include additional hyper-parameters (e.g., learning rate, stride, epochs, etc.), such as the convolutional layers, the fully connected layers, and the pooling layers, while other layers may not, such as the ReLU layers. The parameters and hyper-parameters are not to be limited, and may differ depending on the embodiment.

The output of the convolutional stream(s) 510 may be input to the fully connected layer(s) 514 of the convolutional network 504. In addition to the output of the convolutional stream(s) 510, variable(s), at least some of which may be representative of the vehicle state, may be input to the fully connected layer(s) 514.
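For illustration only, combining the convolutional stream output with vehicle-state variables may be sketched as a simple concatenation before the fully connected layer(s) 514 (PyTorch assumed; the feature sizes and the two-output control head are illustrative assumptions):

    import torch
    import torch.nn as nn

    class ControlHead(nn.Module):
        def __init__(self, stream_features=16, state_features=4, num_controls=2):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(stream_features + state_features, 64),
                nn.ReLU(),
                nn.Linear(64, num_controls),  # e.g., steering and acceleration commands
            )

        def forward(self, stream_out, vehicle_state):
            # concatenate CNN features with vehicle-state variables (e.g., speed, steering angle)
            return self.fc(torch.cat([stream_out, vehicle_state], dim=1))

    head = ControlHead()
    controls = head(torch.randn(1, 16), torch.randn(1, 4))  # -> shape [1, 2]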

The machine learning model(s) 504 may be trained using example control data (e.g., vehicle control data, trajectories, etc.) as ground truth data and/or sensor data 502 for given inputs to the machine learning model 504. In some examples, the control data may be based on the control inputs to the remote control(s) 118 of the remote control system 106, and/or based on the vehicle control data generated as a result of the control inputs. In some examples, the training data may correspond to a virtual vehicle, such as a vehicle driven in a virtual simulation comprising a virtual environment.

Now referring to FIG. 6, each block of method 600, described herein, comprises a computing process that may be performed using any combination of hardware, firmware, and/or software. For instance, various functions may be carried out by a processor executing instructions stored in memory. The methods may also be embodied as computer-usable instructions stored on computer storage media. The methods may be provided by a standalone application, a service or hosted service (standalone or in combination with another hosted service), or a plug-in to another product, to name a few. In addition, method 600 may be executed by any one system, or any combination of systems, including but not limited to those described herein.

FIG. 6 is an example flow diagram for a method 600 of training an autonomous vehicle using a machine learning model(s), in accordance with some embodiments of the present disclosure. The method 600, at block B602, includes receiving control data supplied from a remote control system representative of control inputs. For example, control data representative of control inputs supplied by a remote control system 106 may be received (e.g., by the remote control system 106, by a model training server(s), by the vehicle 102, etc.).

The method 600, at block B604, includes converting the control data to vehicle control data usable by a vehicle. For example, the control data may be converted to vehicle control data that is usable by a vehicle (e.g., by the vehicle 102).
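As a purely hypothetical sketch of block B604 (the scale factors, field names, and units below are illustrative assumptions, not values from the disclosure), normalized control inputs may be converted to vehicle control data as follows:

    # Map normalized remote-control inputs (steering in [-1, 1], throttle/brake in [0, 1])
    # to vehicle control data; the constants are illustrative only.
    def convert_control_data(control_data, max_steering_deg=450.0, max_accel_mps2=3.0):
        return {
            "steering_angle_deg": control_data["steering"] * max_steering_deg,
            "acceleration_mps2": control_data["throttle"] * max_accel_mps2,
            "brake_pressure_pct": control_data["brake"] * 100.0,
        }

    vehicle_control = convert_control_data({"steering": -0.25, "throttle": 0.4, "brake": 0.0})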

The method 600, at block B606, includes receiving sensor data generated by one or more sensors of the vehicle during execution of vehicle controls corresponding to the vehicle control data by the vehicle. For example, sensor data may be received (e.g., by the remote control system 106, the model training server(s), the vehicle 102, etc.), where the sensor data is or was generated during execution of vehicle controls corresponding to the vehicle control data. The vehicle controls may be the controls from the control component(s) 128, the planning component(s) 126, the actuation component(s) 132, the steering component 112A, the braking component 112B, the acceleration component 112C, and/or other components of the vehicle 102 (and/or another vehicle or object).

The method 600, at block B608, includes applying the vehicle control data and/or the sensor data to a machine learning model(s). For example, the sensor data (e.g., image data, LIDAR data, SONAR data, vehicle state data, etc.) may be applied to the machine learning model(s) (e.g., the machine learning model(s) 504 of FIGS. 5A-5B). In some examples, the sensor data may be applied to the machine learning model(s) and the vehicle control data may be used as ground truth data to train the machine learning model(s).

The method 600, at block B610, includes computing, by the machine learning model(s), vehicle control(s). For example, the machine learning model(s) may compute vehicle control(s) (e.g., represented as vehicle control data) that correspond to the sensor data.

The method 600, at block B612, includes comparing the vehicle control(s) to ground truth data. For example, the ground truth data may include the vehicle control data and/or paths or trajectories through the environment as labeled or annotated within the representations of the sensor data (e.g., the images).

The method 600, at block B614, includes, based on the comparing at block B612, updating the machine learning model(s). For example, the parameters (e.g., weights, biases, etc.) of the machine learning model(s) may be updated (e.g., using backpropagation, parameter updates, etc.). This process may repeat until the machine learning model(s) has acceptable or desirable accuracy.
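For illustration only, blocks B608-B614 may be sketched as a conventional supervised training loop (PyTorch assumed; model, dataset, and the mean-squared-error loss are illustrative placeholders, not requirements of the disclosure):

    import torch
    import torch.nn as nn

    def train(model, dataset, epochs=10, lr=1e-4):
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            for sensor_data, ground_truth_controls in dataset:              # B606/B608
                predicted_controls = model(sensor_data)                     # B610
                loss = loss_fn(predicted_controls, ground_truth_controls)   # B612
                optimizer.zero_grad()
                loss.backward()                                             # B614: backpropagation
                optimizer.step()                                            # parameter updates
        return model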

As a result of the method 600, the machine learning model(s) may be trained such that the machine learning model(s), once deployed, may take sensor data as an input and generate vehicle control(s) for navigating through environments, situations, and/or scenarios without the need for remote control. As such, over time, the remote control system 106 may be used to train the vehicle 102 (and/or other vehicles or objects) how to navigate different types of scenarios, situations, and/or environments until remote control, or the remote control system 106, becomes unnecessary.

Example Autonomous Vehicle

FIG. 7A is an illustration of an example autonomous vehicle 102, inaccordance with some embodiments of the present disclosure. Theautonomous vehicle 102 (alternatively referred to herein as the “vehicle102”) may include a passenger vehicle, such as a car, a truck, a bus,and/or another type of vehicle that accommodates one or more passengers.Autonomous vehicles are generally described in terms of automationlevels, defined by the National Highway Traffic Safety Administration(NHTSA), a division of the US Department of Transportation, and theSociety of Automotive Engineers (SAE) “Taxonomy and Definitions forTerms Related to Driving Automation Systems for On-Road Motor Vehicles”(Standard No. J3016-201806, published on Jun. 15, 2018, Standard No.J3016-201609, published on Sep. 30, 2016, and previous and futureversions of this standard). The vehicle 102 may be capable offunctionality in accordance with one or more of Level 3 - Level 5 of theautonomous driving levels. For example, the vehicle 102 may be capableof conditional automation (Level 3), high automation (Level 4), and/orfull automation (Level 5), depending on the embodiment.

The vehicle 102 may include components such as a chassis, a vehiclebody, wheels (e.g., 2, 4, 6, 8, 18, etc.), tires, axles, and othercomponents of a vehicle. The vehicle 102 may include a propulsion system750, such as an internal combustion engine, hybrid electric power plant,an all-electric engine, and/or another propulsion system type. Thepropulsion system 750 may be connected to a drive train of the vehicle102, which may include a transmission, to enable the propulsion of thevehicle 102. The propulsion system 750 may be controlled in response toreceiving signals from the throttle/accelerator 752.

A steering system 754, which may include a steering wheel, may be usedto steer the vehicle 102 (e.g., along a desired path or route) when thepropulsion system 750 is operating (e.g., when the vehicle is inmotion). The steering system 754 may receive signals from a steeringactuator 756. The steering wheel may be optional for full automation(Level 5) functionality.

The brake sensor system 746 may be used to operate the vehicle brakes in response to receiving signals from the brake actuators 748 and/or brake sensors.

Controller(s) 736, which may include one or more system on chips (SoCs)704 (FIG. 7C) and/or GPU(s), may provide signals (e.g., representativeof commands) to one or more components and/or systems of the vehicle102. For example, the controller(s) may send signals to operate thevehicle brakes via one or more brake actuators 748, to operate thesteering system 754 via one or more steering actuators 756, to operatethe propulsion system 750 via one or more throttle/accelerators 752. Thecontroller(s) 736 may include one or more onboard (e.g., integrated)computing devices (e.g., supercomputers) that process sensor signals,and output operation commands (e.g., signals representing commands) toenable autonomous driving and/or to assist a human driver in driving thevehicle 102. The controller(s) 736 may include a first controller 736for autonomous driving functions, a second controller 736 for functionalsafety functions, a third controller 736 for artificial intelligencefunctionality (e.g., computer vision), a fourth controller 736 forinfotainment functionality, a fifth controller 736 for redundancy inemergency conditions, and/or other controllers. In some examples, asingle controller 736 may handle two or more of the abovefunctionalities, two or more controllers 736 may handle a singlefunctionality, and/or any combination thereof.

The controller(s) 736 may provide the signals for controlling one ormore components and/or systems of the vehicle 102 in response to sensordata received from one or more sensors (e.g., sensor inputs). The sensordata may be received from, for example and without limitation, globalnavigation satellite systems sensor(s) 758 (e.g., Global PositioningSystem sensor(s)), RADAR sensor(s) 760, ultrasonic sensor(s) 762, LIDARsensor(s) 764, inertial measurement unit (IMU) sensor(s) 766 (e.g.,accelerometer(s), gyroscope(s), magnetic compass(es), magnetometer(s),etc.), microphone(s) 796, stereo camera(s) 768, wide-view camera(s) 770(e.g., fisheye cameras), infrared camera(s) 772, surround camera(s) 774(e.g., 360 degree cameras), long-range and/or mid-range camera(s) 798,speed sensor(s) 744 (e.g., for measuring the speed of the vehicle 102),vibration sensor(s) 742, steering sensor(s) 740, brake sensor(s) (e.g.,as part of the brake sensor system 746), and/or other sensor types.

One or more of the controller(s) 736 may receive inputs (e.g.,represented by input data) from an instrument cluster 732 of the vehicle102 and provide outputs (e.g., represented by output data, display data,etc.) via a human-machine interface (HMI) display 734, an audibleannunciator, a loudspeaker, and/or via other components of the vehicle102. The outputs may include information such as vehicle velocity,speed, time, map data (e.g., the HD map 722 of FIG. 7C), location data(e.g., the vehicle’s 102 location, such as on a map), direction,location of other vehicles (e.g., an occupancy grid), information aboutobjects and status of objects as perceived by the controller(s) 736,etc. For example, the HMI display 734 may display information about thepresence of one or more objects (e.g., a street sign, caution sign,traffic light changing, etc.), and/or information about drivingmaneuvers the vehicle has made, is making, or will make (e.g., changinglanes now, taking exit 34B in two miles, etc.).

The vehicle 102 further includes a network interface 724 which may useone or more wireless antenna(s) 726 and/or modem(s) to communicate overone or more networks. For example, the network interface 724 may becapable of communication over LTE, WCDMA, UMTS, GSM, CDMA2000, etc. Thewireless antenna(s) 726 may also enable communication between objects inthe environment (e.g., vehicles, mobile devices, etc.), using local areanetwork(s), such as Bluetooth, Bluetooth LE, Z-Wave, ZigBee, etc.,and/or low power wide-area network(s) (LPWANs), such as LoRaWAN, SigFox,etc.

FIG. 7B is an example of camera locations and fields of view for theexample autonomous vehicle 102 of FIG. 7A, in accordance with someembodiments of the present disclosure. The cameras and respective fieldsof view are one example embodiment and are not intended to be limiting.For example, additional and/or alternative cameras may be includedand/or the cameras may be located at different locations on the vehicle102.

The camera types for the cameras may include, but are not limited to, digital cameras that may be adapted for use with the components and/or systems of the vehicle 102. The camera(s) may operate at automotive safety integrity level (ASIL) B and/or at another ASIL. The camera types may be capable of any image capture rate, such as 60 frames per second (fps), 120 fps, 240 fps, etc., depending on the embodiment. The cameras may be capable of using rolling shutters, global shutters, another type of shutter, or a combination thereof. In some examples, the color filter array may include a red clear clear clear (RCCC) color filter array, a red clear clear blue (RCCB) color filter array, a red blue green clear (RBGC) color filter array, a Foveon X3 color filter array, a Bayer sensor (RGGB) color filter array, a monochrome sensor color filter array, and/or another type of color filter array. In some embodiments, clear pixel cameras, such as cameras with an RCCC, an RCCB, and/or an RBGC color filter array, may be used in an effort to increase light sensitivity.

In some examples, one or more of the camera(s) may be used to performadvanced driver assistance systems (ADAS) functions (e.g., as part of aredundant or fail-safe design). For example, a Multi-Function MonoCamera may be installed to provide functions including lane departurewarning, traffic sign assist and intelligent headlamp control. One ormore of the camera(s) (e.g., all of the cameras) may record and provideimage data (e.g., video) simultaneously.

One or more of the cameras may be mounted in a mounting assembly, suchas a custom designed (3-D printed) assembly, in order to cut out straylight and reflections from within the car (e.g., reflections from thedashboard reflected in the windshield mirrors) which may interfere withthe camera’s image data capture abilities. With reference to wing-mirrormounting assemblies, the wing-mirror assemblies may be custom 3-Dprinted so that the camera mounting plate matches the shape of thewing-mirror. In some examples, the camera(s) may be integrated into thewing-mirror. For side-view cameras, the camera(s) may also be integratedwithin the four pillars at each corner of the cabin.

Cameras with a field of view that include portions of the environment in front of the vehicle 102 (e.g., front-facing cameras) may be used for surround view, to help identify forward-facing paths and obstacles, as well as aid in, with the help of one or more controllers 736 and/or control SoCs, providing information critical to generating an occupancy grid and/or determining the preferred vehicle paths. Front-facing cameras may be used to perform many of the same ADAS functions as LIDAR, including emergency braking, pedestrian detection, and collision avoidance. Front-facing cameras may also be used for ADAS functions and systems including Lane Departure Warnings (“LDW”), Autonomous Cruise Control (“ACC”), and/or other functions such as traffic sign recognition.

A variety of cameras may be used in a front-facing configuration, including, for example, a monocular camera platform that includes a CMOS (complementary metal oxide semiconductor) color imager. Another example may be a wide-view camera(s) 770 that may be used to perceive objects coming into view from the periphery (e.g., pedestrians, crossing traffic, or bicycles). Although only one wide-view camera is illustrated in FIG. 7B, there may be any number of wide-view cameras 770 on the vehicle 102. In addition, long-range camera(s) 798 (e.g., a long-view stereo camera pair) may be used for depth-based object detection, especially for objects for which a neural network has not yet been trained. The long-range camera(s) 798 may also be used for object detection and classification, as well as basic object tracking.

One or more stereo cameras 768 may also be included in a front-facingconfiguration. The stereo camera(s) 768 may include an integratedcontrol unit comprising a scalable processing unit, which may provide aprogrammable logic (FPGA) and a multi-core micro-processor with anintegrated CAN or Ethernet interface on a single chip. Such a unit maybe used to generate a 3-D map of the vehicle’s environment, including adistance estimate for all the points in the image. An alternative stereocamera(s) 768 may include a compact stereo vision sensor(s) that mayinclude two camera lenses (one each on the left and right) and an imageprocessing chip that may measure the distance from the vehicle to thetarget object and use the generated information (e.g., metadata) toactivate the autonomous emergency braking and lane departure warningfunctions. Other types of stereo camera(s) 768 may be used in additionto, or alternatively from, those described herein.

Cameras with a field of view that include portions of the environment to the side of the vehicle 102 (e.g., side-view cameras) may be used for surround view, providing information used to create and update the occupancy grid, as well as to generate side impact collision warnings. For example, surround camera(s) 774 (e.g., four surround cameras 774 as illustrated in FIG. 7B) may be positioned on the vehicle 102. The surround camera(s) 774 may include wide-view camera(s) 770, fisheye camera(s), 360 degree camera(s), and/or the like. For example, four fisheye cameras may be positioned on the vehicle’s front, rear, and sides. In an alternative arrangement, the vehicle may use three surround camera(s) 774 (e.g., left, right, and rear), and may leverage one or more other camera(s) (e.g., a forward-facing camera) as a fourth surround view camera.

Cameras with a field of view that include portions of the environment tothe rear of the vehicle 102 (e.g., rear-view cameras) may be used forpark assistance, surround view, rear collision warnings, and creatingand updating the occupancy grid. A wide variety of cameras may be usedincluding, but not limited to, cameras that are also suitable as afront-facing camera(s) (e.g., long-range and/or mid-range camera(s) 798,stereo camera(s) 768), infrared camera(s) 772, etc.), as describedherein.

FIG. 7C is a block diagram of an example system architecture for theexample autonomous vehicle 102 of FIG. 7A, in accordance with someembodiments of the present disclosure. It should be understood that thisand other arrangements described herein are set forth only as examples.Other arrangements and elements (e.g., machines, interfaces, functions,orders, groupings of functions, etc.) may be used in addition to orinstead of those shown, and some elements may be omitted altogether.Further, many of the elements described herein are functional entitiesthat may be implemented as discrete or distributed components or inconjunction with other components, and in any suitable combination andlocation. Various functions described herein as being performed byentities may be carried out by hardware, firmware, and/or software. Forinstance, various functions may be carried out by a processor executinginstructions stored in memory.

Each of the components, features, and systems of the vehicle 102 in FIG. 7C are illustrated as being connected via bus 702. The bus 702 may include a Controller Area Network (CAN) data interface (alternatively referred to herein as a “CAN bus”). A CAN may be a network inside the vehicle 102 used to aid in control of various features and functionality of the vehicle 102, such as actuation of brakes, acceleration, braking, steering, windshield wipers, etc. A CAN bus may be configured to have dozens or even hundreds of nodes, each with its own unique identifier (e.g., a CAN ID). The CAN bus may be read to find steering wheel angle, ground speed, engine revolutions per minute (RPMs), button positions, and/or other vehicle status indicators. The CAN bus may be ASIL B compliant.
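As a hypothetical sketch only (the CAN ID, byte layout, and scale factor below are illustrative assumptions; actual values depend on the vehicle’s message definitions), reading a steering-wheel-angle status indicator from a CAN bus might look like the following, assuming the python-can package is available:

    import can  # python-can (assumed available)

    STEERING_ANGLE_ID = 0x25  # hypothetical CAN ID

    def read_steering_angle(channel="can0", timeout_s=1.0):
        with can.interface.Bus(channel=channel, bustype="socketcan") as bus:
            msg = bus.recv(timeout=timeout_s)
            if msg is not None and msg.arbitration_id == STEERING_ANGLE_ID:
                raw = int.from_bytes(msg.data[0:2], byteorder="big", signed=True)
                return raw * 0.1  # hypothetical scale: 0.1 degree per bit
        return None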

Although the bus 702 is described herein as being a CAN bus, this is not intended to be limiting. For example, in addition to, or alternatively from, the CAN bus, FlexRay and/or Ethernet may be used. Additionally, although a single line is used to represent the bus 702, this is not intended to be limiting. For example, there may be any number of busses 702, which may include one or more CAN busses, one or more FlexRay busses, one or more Ethernet busses, and/or one or more other types of busses using a different protocol. In some examples, two or more busses 702 may be used to perform different functions, and/or may be used for redundancy. For example, a first bus 702 may be used for collision avoidance functionality and a second bus 702 may be used for actuation control. In any example, each bus 702 may communicate with any of the components of the vehicle 102, and two or more busses 702 may communicate with the same components. In some examples, each SoC 704, each controller 736, and/or each computer within the vehicle may have access to the same input data (e.g., inputs from sensors of the vehicle 102), and may be connected to a common bus, such as the CAN bus.

The vehicle 102 may include one or more controller(s) 736, such as thosedescribed herein with respect to FIG. 7A. The controller(s) 736 may beused for a variety of functions. The controller(s) 736 may be coupled toany of the various other components and systems of the vehicle 102, andmay be used for control of the vehicle 102, artificial intelligence ofthe vehicle 102, infotainment for the vehicle 102, and/or the like.

The vehicle 102 may include a system(s) on a chip (SoC) 704. The SoC 704may include CPU(s) 706, GPU(s) 708, processor(s) 710, cache(s) 712,accelerator(s) 714, data store(s) 716, and/or other components andfeatures not illustrated. The SoC(s) 704 may be used to control thevehicle 102 in a variety of platforms and systems. For example, theSoC(s) 704 may be combined in a system (e.g., the system of the vehicle102) with an HD map 722 which may obtain map refreshes and/or updatesvia a network interface 724 from one or more servers (e.g., server(s)778 of FIG. 7D).

The CPU(s) 706 may include a CPU cluster or CPU complex (alternativelyreferred to herein as a “CCPLEX”). The CPU(s) 706 may include multiplecores and/or L2 caches. For example, in some embodiments, the CPU(s) 706may include eight cores in a coherent multiprocessor configuration. Insome embodiments, the CPU(s) 706 may include four dual-core clusterswhere each cluster has a dedicated L2 cache (e.g., a 2 MB L2 cache). TheCPU(s) 706 (e.g., the CCPLEX) may be configured to support simultaneouscluster operation enabling any combination of the clusters of the CPU(s)706 to be active at any given time.

The CPU(s) 706 may implement power management capabilities that includeone or more of the following features: individual hardware blocks may beclock-gated automatically when idle to save dynamic power; each coreclock may be gated when the core is not actively executing instructionsdue to execution of WFI/WFE instructions; each core may be independentlypower-gated; each core cluster may be independently clock-gated when allcores are clock-gated or power-gated; and/or each core cluster may beindependently power-gated when all cores are power-gated. The CPU(s) 706may further implement an enhanced algorithm for managing power states,where allowed power states and expected wakeup times are specified, andthe hardware/microcode determines the best power state to enter for thecore, cluster, and CCPLEX. The processing cores may support simplifiedpower state entry sequences in software with the work offloaded tomicrocode.

The GPU(s) 708 may include an integrated GPU (alternatively referred toherein as an “iGPU”). The GPU(s) 708 may be programmable and may beefficient for parallel workloads. The GPU(s) 708, in some examples, mayuse an enhanced tensor instruction set. The GPU(s) 708 may include oneor more streaming microprocessors, where each streaming microprocessormay include an L1 cache (e.g., an L1 cache with at least 96KB storagecapacity), and two or more of the streaming microprocessors may share anL2 cache (e.g., an L2 cache with a 512 KB storage capacity). In someembodiments, the GPU(s) 708 may include at least eight streamingmicroprocessors. The GPU(s) 708 may use compute application programminginterface(s) (API(s)). In addition, the GPU(s) 708 may use one or moreparallel computing platforms and/or programming models (e.g., NVIDIA’sCUDA).

The GPU(s) 708 may be power-optimized for best performance in automotive and embedded use cases. For example, the GPU(s) 708 may be fabricated on a Fin field-effect transistor (FinFET). However, this is not intended to be limiting and the GPU(s) 708 may be fabricated using other semiconductor manufacturing processes. Each streaming microprocessor may incorporate a number of mixed-precision processing cores partitioned into multiple blocks. For example, and without limitation, 64 FP32 cores and 32 FP64 cores may be partitioned into four processing blocks. In such an example, each processing block may be allocated 16 FP32 cores, 8 FP64 cores, 16 INT32 cores, two mixed-precision NVIDIA TENSOR COREs for deep learning matrix arithmetic, an L0 instruction cache, a warp scheduler, a dispatch unit, and/or a 64 KB register file. In addition, the streaming microprocessors may include independent parallel integer and floating-point data paths to provide for efficient execution of workloads with a mix of computation and addressing calculations. The streaming microprocessors may include independent thread scheduling capability to enable finer-grain synchronization and cooperation between parallel threads. The streaming microprocessors may include a combined L1 data cache and shared memory unit in order to improve performance while simplifying programming.

The GPU(s) 708 may include a high bandwidth memory (HBM) and/or a 16 GBHBM2 memory subsystem to provide, in some examples, about 900 GB/secondpeak memory bandwidth. In some examples, in addition to, oralternatively from, the HBM memory, a synchronous graphics random-accessmemory (SGRAM) may be used, such as a graphics double data rate typefive synchronous random-access memory (GDDR5).

The GPU(s) 708 may include unified memory technology including access counters to allow for more accurate migration of memory pages to the processor that accesses them most frequently, thereby improving efficiency for memory ranges shared between processors. In some examples, address translation services (ATS) support may be used to allow the GPU(s) 708 to access the CPU(s) 706 page tables directly. In such examples, when the GPU(s) 708 memory management unit (MMU) experiences a miss, an address translation request may be transmitted to the CPU(s) 706. In response, the CPU(s) 706 may look in its page tables for the virtual-to-physical mapping for the address and transmit the translation back to the GPU(s) 708. As such, unified memory technology may allow a single unified virtual address space for memory of both the CPU(s) 706 and the GPU(s) 708, thereby simplifying the GPU(s) 708 programming and porting of applications to the GPU(s) 708.

In addition, the GPU(s) 708 may include an access counter that may keeptrack of the frequency of access of the GPU(s) 708 to memory of otherprocessors. The access counter may help ensure that memory pages aremoved to the physical memory of the processor that is accessing thepages most frequently.

The SoC(s) 704 may include any number of cache(s) 712, including those described herein. For example, the cache(s) 712 may include an L3 cache that is available to both the CPU(s) 706 and the GPU(s) 708 (e.g., that is connected to both the CPU(s) 706 and the GPU(s) 708). The cache(s) 712 may include a write-back cache that may keep track of states of lines, such as by using a cache coherence protocol (e.g., MEI, MESI, MSI, etc.). The L3 cache may include 4 MB or more, depending on the embodiment, although smaller cache sizes may be used.

The SoC(s) 704 may include one or more accelerators 714 (e.g., hardwareaccelerators, software accelerators, or a combination thereof). Forexample, the SoC(s) 704 may include a hardware acceleration cluster thatmay include optimized hardware accelerators and/or large on-chip memory.The large on-chip memory (e.g., 4MB of SRAM), may enable the hardwareacceleration cluster to accelerate neural networks and othercalculations. The hardware acceleration cluster may be used tocomplement the GPU(s) 708 and to off-load some of the tasks of theGPU(s) 708 (e.g., to free up more cycles of the GPU(s) 708 forperforming other tasks). As an example, the accelerator(s) 714 may beused for targeted workloads (e.g., perception, convolutional neuralnetworks (CNNs), etc.) that are stable enough to be amenable toacceleration. The term “CNN,” as used herein, may include all types ofCNNs, including region-based or regional convolutional neural networks(RCNNs) and Fast RCNNs (e.g., as used for object detection).

The accelerator(s) 714 (e.g., the hardware acceleration cluster) mayinclude a deep learning accelerator(s) (DLA). The DLA(s) may include oneor more Tensor processing units (TPUs) that may be configured to providean additional ten trillion operations per second for deep learningapplications and inferencing. The TPUs may be accelerators configuredto, and optimized for, performing image processing functions (e.g., forCNNs, RCNNs, etc.). The DLA(s) may further be optimized for a specificset of neural network types and floating point operations, as well asinferencing. The design of the DLA(s) may provide more performance permillimeter than a general-purpose GPU, and vastly exceeds theperformance of a CPU. The TPU(s) may perform several functions,including a single-instance convolution function, supporting, forexample, INT8, INT16, and FP16 data types for both features and weights,as well as post-processor functions.

The DLA(s) may quickly and efficiently execute neural networks, especially CNNs, on processed or unprocessed data for any of a variety of functions, including, for example and without limitation: a CNN for object identification and detection using data from camera sensors; a CNN for distance estimation using data from camera sensors; a CNN for emergency vehicle detection and identification using data from microphones; a CNN for facial recognition and vehicle owner identification using data from camera sensors; and/or a CNN for security and/or safety related events.

The DLA(s) may perform any function of the GPU(s) 708, and by using aninference accelerator, for example, a designer may target either theDLA(s) or the GPU(s) 708 for any function. For example, the designer mayfocus processing of CNNs and floating point operations on the DLA(s) andleave other functions to the GPU(s) 708 and/or other accelerator(s) 714.

The accelerator(s) 714 (e.g., the hardware acceleration cluster) mayinclude a programmable vision accelerator(s) (PVA), which mayalternatively be referred to herein as a computer vision accelerator.The PVA(s) may be designed and configured to accelerate computer visionalgorithms for the advanced driver assistance systems (ADAS), autonomousdriving, and/or augmented reality (AR) and/or virtual reality (VR)applications. The PVA(s) may provide a balance between performance andflexibility. For example, each PVA(s) may include, for example andwithout limitation, any number of reduced instruction set computer(RISC) cores, direct memory access (DMA), and/or any number of vectorprocessors.

The RISC cores may interact with image sensors (e.g., the image sensorsof any of the cameras described herein), image signal processor(s),and/or the like. Each of the RISC cores may include any amount ofmemory. The RISC cores may use any of a number of protocols, dependingon the embodiment. In some examples, the RISC cores may execute areal-time operating system (RTOS). The RISC cores may be implementedusing one or more integrated circuit devices, application specificintegrated circuits (ASICs), and/or memory devices. For example, theRISC cores may include an instruction cache and/or a tightly coupledRAM.

The DMA may enable components of the PVA(s) to access the system memoryindependently of the CPU(s) 706. The DMA may support any number offeatures used to provide optimization to the PVA including, but notlimited to, supporting multi-dimensional addressing and/or circularaddressing. In some examples, the DMA may support up to six or moredimensions of addressing, which may include block width, block height,block depth, horizontal block stepping, vertical block stepping, and/ordepth stepping.

The vector processors may be programmable processors that may bedesigned to efficiently and flexibly execute programming for computervision algorithms and provide signal processing capabilities. In someexamples, the PVA may include a PVA core and two vector processingsubsystem partitions. The PVA core may include a processor subsystem,DMA engine(s) (e.g., two DMA engines), and/or other peripherals. Thevector processing subsystem may operate as the primary processing engineof the PVA, and may include a vector processing unit (VPU), aninstruction cache, and/or vector memory (e.g., VMEM). A VPU core mayinclude a digital signal processor such as, for example, a singleinstruction, multiple data (SIMD), very long instruction word (VLIW)digital signal processor. The combination of the SIMD and VLIW mayenhance throughput and speed.

Each of the vector processors may include an instruction cache and maybe coupled to dedicated memory. As a result, in some examples, each ofthe vector processors may be configured to execute independently of theother vector processors. In other examples, the vector processors thatare included in a particular PVA may be configured to employ dataparallelism. For example, in some embodiments, the plurality of vectorprocessors included in a single PVA may execute the same computer visionalgorithm, but on different regions of an image. In other examples, thevector processors included in a particular PVA may simultaneouslyexecute different computer vision algorithms, on the same image, or evenexecute different algorithms on sequential images or portions of animage. Among other things, any number of PVAs may be included in thehardware acceleration cluster and any number of vector processors may beincluded in each of the PVAs. In addition, the PVA(s) may includeadditional error correcting code (ECC) memory, to enhance overall systemsafety.

The accelerator(s) 714 (e.g., the hardware acceleration cluster) mayinclude a computer vision network on-chip and SRAM, for providing ahigh-bandwidth, low latency SRAM for the accelerator(s) 714. In someexamples, the on-chip memory may include at least 4 MB SRAM, consistingof, for example and without limitation, eight field-configurable memoryblocks, that may be accessible by both the PVA and the DLA. Each pair ofmemory blocks may include an advanced peripheral bus (APB) interface,configuration circuitry, a controller, and a multiplexer. Any type ofmemory may be used. The PVA and DLA may access the memory via a backbonethat provides the PVA and DLA with high-speed access to memory. Thebackbone may include a computer vision network on-chip thatinterconnects the PVA and the DLA to the memory (e.g., using the APB).

The computer vision network on-chip may include an interface that determines, before transmission of any control signal/address/data, that both the PVA and the DLA provide ready and valid signals. Such an interface may provide for separate phases and separate channels for transmitting control signals/addresses/data, as well as burst-type communications for continuous data transfer. This type of interface may comply with ISO 26262 or IEC 61508 standards, although other standards and protocols may be used.

In some examples, the SoC(s) 704 may include a real-time ray-tracing hardware accelerator, such as described in U.S. Pat. Application No. 16/101,232, filed on Aug. 10, 2018. The real-time ray-tracing hardware accelerator may be used to quickly and efficiently determine the positions and extents of objects (e.g., within a world model), to generate real-time visualization simulations, for RADAR signal interpretation, for sound propagation synthesis and/or analysis, for simulation of SONAR systems, for general wave propagation simulation, for comparison to LIDAR data for purposes of localization and/or other functions, and/or for other uses.

The accelerator(s) 714 (e.g., the hardware accelerator cluster) have awide array of uses for autonomous driving. The PVA may be a programmablevision accelerator that may be used for key processing stages in ADASand autonomous vehicles. The PVA’s capabilities are a good match foralgorithmic domains needing predictable processing, at low power and lowlatency. In other words, the PVA performs well on semi-dense or denseregular computation, even on small data sets, which need predictablerun-times with low latency and low power. Thus, in the context ofplatforms for autonomous vehicles, the PVAs are designed to run classiccomputer vision algorithms, as they are efficient at object detectionand operating on integer math.

For example, according to one embodiment of the technology, the PVA isused to perform computer stereo vision. A semi-global matching-basedalgorithm may be used in some examples, although this is not intended tobe limiting. Many applications for Level 3-5 autonomous driving requiremotion estimation/stereo matching on-the-fly (e.g., structure frommotion, pedestrian recognition, lane detection, etc.). The PVA mayperform computer stereo vision function on inputs from two monocularcameras.

In some examples, the PVA may be used to perform dense optical flow. For example, the PVA may process raw RADAR data (e.g., using a 4D Fast Fourier Transform) to provide processed RADAR data. In other examples, the PVA is used for time of flight depth processing, by processing raw time of flight data to provide processed time of flight data, for example.

The DLA may be used to run any type of network to enhance control and driving safety, including, for example, a neural network that outputs a measure of confidence for each object detection. Such a confidence value may be interpreted as a probability, or as providing a relative “weight” of each detection compared to other detections. This confidence value enables the system to make further decisions regarding which detections should be considered as true positive detections rather than false positive detections. For example, the system may set a threshold value for the confidence and consider only the detections exceeding the threshold value as true positive detections. In an automatic emergency braking (AEB) system, false positive detections would cause the vehicle to automatically perform emergency braking, which is obviously undesirable. Therefore, only the most confident detections should be considered as triggers for AEB. The DLA may run a neural network for regressing the confidence value. The neural network may take as its input at least some subset of parameters, such as bounding box dimensions, a ground plane estimate (e.g., obtained from another subsystem), inertial measurement unit (IMU) sensor 766 output that correlates with the vehicle 102 orientation, distance, 3D location estimates of the object obtained from the neural network and/or other sensors (e.g., LIDAR sensor(s) 764 or RADAR sensor(s) 760), among others.
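For illustration only, the thresholding step described above may be sketched as follows (the threshold value and detection structure are illustrative assumptions, not part of the disclosure):

    # Keep only detections confident enough to be treated as true positives
    # before they may trigger automatic emergency braking (AEB).
    def filter_detections(detections, confidence_threshold=0.9):
        # each detection is assumed to be a dict with a 'confidence' value in [0, 1]
        return [d for d in detections if d["confidence"] >= confidence_threshold]

    detections = [
        {"label": "pedestrian", "confidence": 0.97},
        {"label": "pedestrian", "confidence": 0.42},  # likely false positive; dropped
    ]
    aeb_candidates = filter_detections(detections)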

The SoC(s) 704 may include data store(s) 716 (e.g., memory). The data store(s) 716 may be on-chip memory of the SoC(s) 704, which may store neural networks to be executed on the GPU and/or the DLA. In some examples, the data store(s) 716 may be large enough in capacity to store multiple instances of neural networks for redundancy and safety. The data store(s) 716 may comprise L2 or L3 cache(s) 712. Reference to the data store(s) 716 may include reference to the memory associated with the PVA, DLA, and/or other accelerator(s) 714, as described herein.

The SoC(s) 704 may include one or more processor(s) 710 (e.g., embedded processors). The processor(s) 710 may include a boot and power management processor that may be a dedicated processor and subsystem to handle boot power and management functions and related security enforcement. The boot and power management processor may be a part of the SoC(s) 704 boot sequence and may provide runtime power management services. The boot power and management processor may provide clock and voltage programming, assistance in system low power state transitions, management of SoC(s) 704 thermals and temperature sensors, and/or management of the SoC(s) 704 power states. Each temperature sensor may be implemented as a ring-oscillator whose output frequency is proportional to temperature, and the SoC(s) 704 may use the ring-oscillators to detect temperatures of the CPU(s) 706, GPU(s) 708, and/or accelerator(s) 714. If temperatures are determined to exceed a threshold, the boot and power management processor may enter a temperature fault routine and put the SoC(s) 704 into a lower power state and/or put the vehicle 102 into a chauffeur to safe stop mode (e.g., bring the vehicle 102 to a safe stop).
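As a purely illustrative sketch of the temperature-fault behavior described above (the conversion constant, threshold, and callback names are hypothetical and not taken from the disclosure):

    TEMP_PER_HZ = 0.001        # hypothetical degrees C per Hz of ring-oscillator frequency
    FAULT_THRESHOLD_C = 105.0  # hypothetical fault threshold

    def check_thermals(ring_oscillator_hz, enter_low_power_state, request_safe_stop):
        # treat the ring-oscillator output frequency as proportional to temperature
        temperature_c = ring_oscillator_hz * TEMP_PER_HZ
        if temperature_c > FAULT_THRESHOLD_C:
            enter_low_power_state()   # put the SoC(s) into a lower power state
            request_safe_stop()       # chauffeur-to-safe-stop request
        return temperature_c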

The processor(s) 710 may further include a set of embedded processorsthat may serve as an audio processing engine. The audio processingengine may be an audio subsystem that enables full hardware support formulti-channel audio over multiple interfaces, and a broad and flexiblerange of audio I/O interfaces. In some examples, the audio processingengine is a dedicated processor core with a digital signal processorwith dedicated RAM.

The processor(s) 710 may further include an always on processor enginethat may provide necessary hardware features to support low power sensormanagement and wake use cases. The always on processor engine mayinclude a processor core, a tightly coupled RAM, supporting peripherals(e.g., timers and interrupt controllers), various I/O controllerperipherals, and routing logic.

The processor(s) 710 may further include a safety cluster engine thatincludes a dedicated processor subsystem to handle safety management forautomotive applications. The safety cluster engine may include two ormore processor cores, a tightly coupled RAM, support peripherals (e.g.,timers, an interrupt controller, etc.), and/or routing logic. In asafety mode, the two or more cores may operate in a lockstep mode andfunction as a single core with comparison logic to detect anydifferences between their operations.

The processor(s) 710 may further include a real-time camera engine that may include a dedicated processor subsystem for handling real-time camera management.

The processor(s) 710 may further include a high-dynamic range signal processor that may include an image signal processor that is a hardware engine that is part of the camera processing pipeline.

The processor(s) 710 may include a video image compositor that may be a processing block (e.g., implemented on a microprocessor) that implements video post-processing functions needed by a video playback application to produce the final image for the player window. The video image compositor may perform lens distortion correction on wide-view camera(s) 770, surround camera(s) 774, and/or on in-cabin monitoring camera sensors. An in-cabin monitoring camera sensor is preferably monitored by a neural network running on another instance of the advanced SoC, configured to identify in-cabin events and respond accordingly. An in-cabin system may perform lip reading to activate cellular service and place a phone call, dictate emails, change the vehicle’s destination, activate or change the vehicle’s infotainment system and settings, or provide voice-activated web surfing. Certain functions are available to the driver only when the vehicle is operating in an autonomous mode, and are disabled otherwise.

The video image compositor may include enhanced temporal noise reduction for both spatial and temporal noise reduction. For example, where motion occurs in a video, the noise reduction weights spatial information appropriately, decreasing the weight of information provided by adjacent frames. Where an image or portion of an image does not include motion, the temporal noise reduction performed by the video image compositor may use information from the previous image to reduce noise in the current image.
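For illustration only, a simplified, motion-adaptive blend of this kind may be sketched as follows (NumPy assumed; the motion metric and threshold are illustrative assumptions, not the compositor’s actual algorithm):

    import numpy as np

    def temporal_denoise(current, previous, motion_threshold=12.0):
        # weight the previous frame heavily where little motion is detected,
        # and rely on the current (spatial) information where motion is strong
        motion = np.abs(current.astype(np.float32) - previous.astype(np.float32))
        alpha = np.clip(motion / motion_threshold, 0.0, 1.0)  # 1.0 where motion is strong
        blended = alpha * current + (1.0 - alpha) * previous
        return blended.astype(current.dtype)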

The video image compositor may also be configured to perform stereorectification on input stereo lens frames. The video image compositormay further be used for user interface composition when the operatingsystem desktop is in use, and the GPU(s) 708 is not required tocontinuously render new surfaces. Even when the GPU(s) 708 is powered onand active doing 3D rendering, the video image compositor may be used tooffload the GPU(s) 708 to improve performance and responsiveness.

The SoC(s) 704 may further include a mobile industry processor interface(MIPI) camera serial interface for receiving video and input fromcameras, a high-speed interface, and/or a video input block that may beused for camera and related pixel input functions. The SoC(s) 704 mayfurther include an input/output controller(s) that may be controlled bysoftware and may be used for receiving I/O signals that are uncommittedto a specific role.

The SoC(s) 704 may further include a broad range of peripheralinterfaces to enable communication with peripherals, audio codecs, powermanagement, and/or other devices. The SoC(s) 704 may be used to processdata from cameras (e.g., connected over Gigabit Multimedia Serial Linkand Ethernet), sensors (e.g., LIDAR sensor(s) 764, RADAR sensor(s) 760,etc. that may be connected over Ethernet), data from bus 702 (e.g.,speed of vehicle 102, steering wheel position, etc.), data from GNSSsensor(s) 758 (e.g., connected over Ethernet or CAN bus). The SoC(s) 704may further include dedicated high-performance mass storage controllersthat may include their own DMA engines, and that may be used to free theCPU(s) 706 from routine data management tasks.

The SoC(s) 704 may be an end-to-end platform with a flexiblearchitecture that spans automation levels 3-5, thereby providing acomprehensive functional safety architecture that leverages and makesefficient use of computer vision and ADAS techniques for diversity andredundancy, provides a platform for a flexible, reliable drivingsoftware stack, along with deep learning tools. The SoC(s) 704 may befaster, more reliable, and even more energy-efficient andspace-efficient than conventional systems. For example, theaccelerator(s) 714, when combined with the CPU(s) 706, the GPU(s) 708,and the data store(s) 716, may provide for a fast, efficient platformfor level 3-5 autonomous vehicles.

The technology thus provides capabilities and functionality that cannotbe achieved by conventional systems. For example, computer visionalgorithms may be executed on CPUs, which may be configured usinghigh-level programming language, such as the C programming language, toexecute a wide variety of processing algorithms across a wide variety ofvisual data. However, CPUs are oftentimes unable to meet the performancerequirements of many computer vision applications, such as those relatedto execution time and power consumption, for example. In particular,many CPUs are unable to execute complex object detection algorithms inreal-time, which is a requirement of in-vehicle ADAS applications, and arequirement for practical Level 3-5 autonomous vehicles.

In contrast to conventional systems, by providing a CPU complex, GPU complex, and a hardware acceleration cluster, the technology described herein allows for multiple neural networks to be performed simultaneously and/or sequentially, and for the results to be combined together to enable Level 3-5 autonomous driving functionality. For example, a CNN executing on the DLA or dGPU (e.g., the GPU(s) 720) may include text and word recognition, allowing the supercomputer to read and understand traffic signs, including signs for which the neural network has not been specifically trained. The DLA may further include a neural network that is able to identify, interpret, and provide semantic understanding of the sign, and to pass that semantic understanding to the path planning modules running on the CPU Complex.

As another example, multiple neural networks may be run simultaneously,as is required for Level 3, 4, or 5 driving. For example, a warning signconsisting of “Caution: flashing lights indicate icy conditions,” alongwith an electric light, may be independently or collectively interpretedby several neural networks. The sign itself may be identified as atraffic sign by a first deployed neural network (e.g., a neural networkthat has been trained), the text “Flashing lights indicate icyconditions” may be interpreted by a second deployed neural network,which informs the vehicle’s path planning software (preferably executingon the CPU Complex) that when flashing lights are detected, icyconditions exist. The flashing light may be identified by operating athird deployed neural network over multiple frames, informing thevehicle’s path-planning software of the presence (or absence) offlashing lights. All three neural networks may run simultaneously, suchas within the DLA and/or on the GPU(s) 708.

In some examples, a CNN for facial recognition and vehicle owneridentification may use data from camera sensors to identify the presenceof an authorized driver and/or owner of the vehicle 102. The always onsensor processing engine may be used to unlock the vehicle when theowner approaches the driver door and turn on the lights, and, insecurity mode, to disable the vehicle when the owner leaves the vehicle.In this way, the SoC(s) 704 provide for security against theft and/orcarjacking.

In another example, a CNN for emergency vehicle detection andidentification may use data from microphones 796 to detect and identifyemergency vehicle sirens. In contrast to conventional systems, that usegeneral classifiers to detect sirens and manually extract features, theSoC(s) 704 use the CNN for classifying environmental and urban sounds,as well as classifying visual data. In a preferred embodiment, the CNNrunning on the DLA is trained to identify the relative closing speed ofthe emergency vehicle (e.g., by using the Doppler effect). The CNN mayalso be trained to identify emergency vehicles specific to the localarea in which the vehicle is operating, as identified by GNSS sensor(s)758. Thus, for example, when operating in Europe the CNN will seek todetect European sirens, and when in the United States the CNN will seekto identify only North American sirens. Once an emergency vehicle isdetected, a control program may be used to execute an emergency vehiclesafety routine, slowing the vehicle, pulling over to the side of theroad, parking the vehicle, and/or idling the vehicle, with theassistance of ultrasonic sensors 762, until the emergency vehicle(s)passes.

The vehicle may include a CPU(s) 718 (e.g., discrete CPU(s), ordCPU(s)), that may be coupled to the SoC(s) 704 via a high-speedinterconnect (e.g., PCIe). The CPU(s) 718 may include an X86 processor,for example. The CPU(s) 718 may be used to perform any of a variety offunctions, including arbitrating potentially inconsistent resultsbetween ADAS sensors and the SoC(s) 704, and/or monitoring the statusand health of the controller(s) 736 and/or infotainment SoC 730, forexample.

The vehicle 102 may include a GPU(s) 720 (e.g., discrete GPU(s), ordGPU(s)), that may be coupled to the SoC(s) 704 via a high-speedinterconnect (e.g., NVIDIA’s NVLINK). The GPU(s) 720 may provideadditional artificial intelligence functionality, such as by executingredundant and/or different neural networks, and may be used to trainand/or update neural networks based on input (e.g., sensor data) fromsensors of the vehicle 102.

The vehicle 102 may further include the network interface 724 which mayinclude one or more wireless antennas 726 (e.g., one or more wirelessantennas for different communication protocols, such as a cellularantenna, a Bluetooth antenna, etc.). The network interface 724 may beused to enable wireless connectivity over the Internet with the cloud(e.g., with the server(s) 778 and/or other network devices), with othervehicles, and/or with computing devices (e.g., client devices ofpassengers). To communicate with other vehicles, a direct link may beestablished between the two vehicles and/or an indirect link may beestablished (e.g., across networks and over the Internet). Direct linksmay be provided using a vehicle-to-vehicle communication link. Thevehicle-to-vehicle communication link may provide the vehicle 102information about vehicles in proximity to the vehicle 102 (e.g.,vehicles in front of, on the side of, and/or behind the vehicle 102).This functionality may be part of a cooperative adaptive cruise controlfunctionality of the vehicle 102.

The network interface 724 may include a SoC that provides modulation anddemodulation functionality and enables the controller(s) 736 tocommunicate over wireless networks. The network interface 724 mayinclude a radio frequency front-end for up-conversion from baseband toradio frequency, and down conversion from radio frequency to baseband.The frequency conversions may be performed through well-known processes,and/or may be performed using super-heterodyne processes. In someexamples, the radio frequency front end functionality may be provided bya separate chip. The network interface may include wirelessfunctionality for communicating over LTE, WCDMA, UMTS, GSM, CDMA2000,Bluetooth, Bluetooth LE, Wi-Fi, Z-Wave, ZigBee, LoRaWAN, and/or otherwireless protocols.

The vehicle 102 may further include data store(s) 728 which may includeoff-chip (e.g., off the SoC(s) 704) storage. The data store(s) 728 mayinclude one or more storage elements including RAM, SRAM, DRAM, VRAM,Flash, hard disks, and/or other components and/or devices that may storeat least one bit of data.

The vehicle 102 may further include GNSS sensor(s) 758. The GNSS sensor(s) 758 (e.g., GPS and/or assisted GPS sensors) may be used to assist in mapping, perception, occupancy grid generation, and/or path planning functions. Any number of GNSS sensor(s) 758 may be used, including, for example and without limitation, a GPS using a USB connector with an Ethernet to Serial (RS-232) bridge.

The vehicle 102 may further include RADAR sensor(s) 760. The RADAR sensor(s) 760 may be used by the vehicle 102 for long-range vehicle detection, even in darkness and/or severe weather conditions. RADAR functional safety levels may be ASIL B. The RADAR sensor(s) 760 may use the CAN and/or the bus 702 (e.g., to transmit data generated by the RADAR sensor(s) 760) for control and to access object tracking data, with access to Ethernet to access raw data in some examples. A wide variety of RADAR sensor types may be used. For example, and without limitation, the RADAR sensor(s) 760 may be suitable for front, rear, and side RADAR use. In some examples, Pulse Doppler RADAR sensor(s) are used.

The RADAR sensor(s) 760 may include different configurations, such aslong range with narrow field of view, short range with wide field ofview, short range side coverage, etc. In some examples, long-range RADARmay be used for adaptive cruise control functionality. The long-rangeRADAR systems may provide a broad field of view realized by two or moreindependent scans, such as within a 250 m range. The RADAR sensor(s) 760may help in distinguishing between static and moving objects, and may beused by ADAS systems for emergency brake assist and forward collisionwarning. Long-range RADAR sensors may include monostatic multimodalRADAR with multiple (e.g., six or more) fixed RADAR antennae and ahigh-speed CAN and FlexRay interface. In an example with six antennae,the central four antennae may create a focused beam pattern, designed torecord the vehicle’s 102 surroundings at higher speeds with minimalinterference from traffic in adjacent lanes. The other two antennae mayexpand the field of view, making it possible to quickly detect vehiclesentering or leaving the vehicle’s 102 lane.

Mid-range RADAR systems may include, as an example, a range of up to 160 m (front) or 80 m (rear), and a field of view of up to 42 degrees (front) or 150 degrees (rear). Short-range RADAR systems may include, without limitation, RADAR sensors designed to be installed at both ends of the rear bumper. When installed at both ends of the rear bumper, such RADAR sensor systems may create two beams that constantly monitor the blind spot in the rear and next to the vehicle.

Short-range RADAR systems may be used in an ADAS system for blind spot detection and/or lane change assist.

The vehicle 102 may further include ultrasonic sensor(s) 762. The ultrasonic sensor(s) 762, which may be positioned at the front, back, and/or the sides of the vehicle 102, may be used for park assist and/or to create and update an occupancy grid. A wide variety of ultrasonic sensor(s) 762 may be used, and different ultrasonic sensor(s) 762 may be used for different ranges of detection (e.g., 2.5 m, 4 m). The ultrasonic sensor(s) 762 may operate at functional safety levels of ASIL B.
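By way of illustration only, and not as the disclosed implementation, the following Python sketch shows one assumed way a single ultrasonic range reading might update a coarse one-dimensional occupancy strip of the kind described above; the cell size, probabilities, and function names are assumptions introduced for the example.

    # Minimal sketch (illustrative assumptions): updating a 1-D occupancy strip
    # from an ultrasonic range reading. Cell size and probabilities are not
    # taken from the disclosure.
    def update_occupancy(cells, range_m, max_range_m=4.0, cell_size_m=0.25,
                         p_free=0.3, p_occupied=0.9):
        """cells: occupancy probabilities along the sensor axis, updated in place."""
        hit_index = int(min(range_m, max_range_m) / cell_size_m)
        for i in range(min(hit_index, len(cells))):
            cells[i] = min(cells[i], p_free)  # space before the echo is likely free
        if range_m < max_range_m and hit_index < len(cells):
            cells[hit_index] = max(cells[hit_index], p_occupied)  # echo cell likely occupied
        return cells

    if __name__ == "__main__":
        grid = [0.5] * 16  # unknown prior along 4 m at 0.25 m resolution
        print(update_occupancy(grid, range_m=2.5))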

The vehicle 102 may include LIDAR sensor(s) 764. The LIDAR sensor(s) 764 may be used for object and pedestrian detection, emergency braking, collision avoidance, and/or other functions. The LIDAR sensor(s) 764 may be functional safety level ASIL B. In some examples, the vehicle 102 may include multiple LIDAR sensors 764 (e.g., two, four, six, etc.) that may use Ethernet (e.g., to provide data to a Gigabit Ethernet switch).

In some examples, the LIDAR sensor(s) 764 may be capable of providing a list of objects and their distances for a 360-degree field of view. Commercially available LIDAR sensor(s) 764 may have an advertised range of approximately 100 m, with an accuracy of 2 cm-3 cm, and with support for a 100 Mbps Ethernet connection, for example. In some examples, one or more non-protruding LIDAR sensors 764 may be used. In such examples, the LIDAR sensor(s) 764 may be implemented as a small device that may be embedded into the front, rear, sides, and/or corners of the vehicle 102. The LIDAR sensor(s) 764, in such examples, may provide up to a 120-degree horizontal and 35-degree vertical field-of-view, with a 200 m range even for low-reflectivity objects. Front-mounted LIDAR sensor(s) 764 may be configured for a horizontal field of view between 45 degrees and 135 degrees.

In some examples, LIDAR technologies, such as 3D flash LIDAR, may also be used. 3D flash LIDAR uses a flash of a laser as a transmission source, to illuminate vehicle surroundings up to approximately 200 m. A flash LIDAR unit includes a receptor, which records the laser pulse transit time and the reflected light on each pixel, which in turn corresponds to the range from the vehicle to the objects. Flash LIDAR may allow for highly accurate and distortion-free images of the surroundings to be generated with every laser flash. In some examples, four flash LIDAR sensors may be deployed, one at each side of the vehicle 102. Available 3D flash LIDAR systems include a solid-state 3D staring array LIDAR camera with no moving parts other than a fan (e.g., a non-scanning LIDAR device). The flash LIDAR device may use a 5 nanosecond class I (eye-safe) laser pulse per frame and may capture the reflected laser light in the form of 3D range point clouds and co-registered intensity data. By using flash LIDAR, and because flash LIDAR is a solid-state device with no moving parts, the LIDAR sensor(s) 764 may be less susceptible to motion blur, vibration, and/or shock.

The vehicle may further include IMU sensor(s) 766. The IMU sensor(s) 766 may be located at a center of the rear axle of the vehicle 102, in some examples. The IMU sensor(s) 766 may include, for example and without limitation, an accelerometer(s), a magnetometer(s), a gyroscope(s), a magnetic compass(es), and/or other sensor types. In some examples, such as in six-axis applications, the IMU sensor(s) 766 may include accelerometers and gyroscopes, while in nine-axis applications, the IMU sensor(s) 766 may include accelerometers, gyroscopes, and magnetometers.

In some embodiments, the IMU sensor(s) 766 may be implemented as a miniature, high-performance GPS-Aided Inertial Navigation System (GPS/INS) that combines micro-electromechanical systems (MEMS) inertial sensors, a high-sensitivity GPS receiver, and advanced Kalman filtering algorithms to provide estimates of position, velocity, and attitude. As such, in some examples, the IMU sensor(s) 766 may enable the vehicle 102 to estimate heading without requiring input from a magnetic sensor by directly observing and correlating the changes in velocity from GPS to the IMU sensor(s) 766. In some examples, the IMU sensor(s) 766 and the GNSS sensor(s) 758 may be combined in a single integrated unit.
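By way of illustration only, and not as the GPS/INS implementation referenced above, the following Python sketch shows a simplified heading estimate that propagates a gyro yaw rate and corrects it toward the GPS course over ground when the vehicle is moving; the filter gain, speed gate, and function names are assumptions introduced for the example.

    # Minimal sketch (assumed gain and speed gate): estimating heading without a
    # magnetometer by nudging a gyro-propagated heading toward the GPS course.
    import math

    def fuse_heading(prev_heading_rad, gyro_yaw_rate_rps, dt_s,
                     gps_vel_north, gps_vel_east, gain=0.02):
        """Dead-reckon heading with the gyro, then correct toward the GPS course."""
        heading = prev_heading_rad + gyro_yaw_rate_rps * dt_s
        speed = math.hypot(gps_vel_north, gps_vel_east)
        if speed > 1.0:  # GPS course is only meaningful above a minimum speed (assumed 1 m/s)
            gps_course = math.atan2(gps_vel_east, gps_vel_north)
            # Wrap the innovation to [-pi, pi] before applying the correction.
            innovation = (gps_course - heading + math.pi) % (2 * math.pi) - math.pi
            heading += gain * innovation
        return heading % (2 * math.pi)

    if __name__ == "__main__":
        h = 0.0
        for _ in range(100):
            h = fuse_heading(h, gyro_yaw_rate_rps=0.01, dt_s=0.1,
                             gps_vel_north=10.0, gps_vel_east=1.0)
        print(f"estimated heading: {math.degrees(h):.1f} deg")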

The vehicle may include microphone(s) 796 placed in and/or around the vehicle 102. The microphone(s) 796 may be used for emergency vehicle detection and identification, among other things.

The vehicle may further include any number of camera types, including stereo camera(s) 768, wide-view camera(s) 770, infrared camera(s) 772, surround camera(s) 774, long-range and/or mid-range camera(s) 798, and/or other camera types. The cameras may be used to capture image data around an entire periphery of the vehicle 102. The types of cameras used depend on the embodiments and requirements for the vehicle 102, and any combination of camera types may be used to provide the necessary coverage around the vehicle 102. In addition, the number of cameras may differ depending on the embodiment. For example, the vehicle may include six cameras, seven cameras, ten cameras, twelve cameras, and/or another number of cameras. The cameras may support, as an example and without limitation, Gigabit Multimedia Serial Link (GMSL) and/or Gigabit Ethernet. Each of the camera(s) is described with more detail herein with respect to FIG. 7A and FIG. 7B.

The vehicle 102 may further include vibration sensor(s) 742. The vibration sensor(s) 742 may measure vibrations of components of the vehicle, such as the axle(s). For example, changes in vibrations may indicate a change in road surfaces. In another example, when two or more vibration sensors 742 are used, the differences between the vibrations may be used to determine friction or slippage of the road surface (e.g., when the difference in vibration is between a power-driven axle and a freely rotating axle).
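By way of illustration only, the following Python sketch shows one assumed way the vibration difference between a power-driven axle and a freely rotating axle might be turned into a slip indication; the RMS comparison, threshold, and sample values are assumptions, not values from the disclosure.

    # Minimal sketch (assumed metric and threshold): flag possible road-surface slip
    # when the driven axle vibrates noticeably more than the freely rotating axle.
    def rms(samples):
        return (sum(s * s for s in samples) / len(samples)) ** 0.5

    def slip_suspected(driven_axle_vibration, free_axle_vibration, ratio_threshold=1.5):
        driven = rms(driven_axle_vibration)
        free = rms(free_axle_vibration)
        return free > 0.0 and (driven / free) > ratio_threshold

    if __name__ == "__main__":
        driven = [0.8, 1.1, 0.9, 1.3, 1.0]   # hypothetical accelerometer readings (m/s^2)
        free = [0.4, 0.5, 0.45, 0.5, 0.48]
        print("slip suspected:", slip_suspected(driven, free))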

The vehicle 102 may include an ADAS system 738. The ADAS system 738 may include a SoC, in some examples. The ADAS system 738 may include autonomous/adaptive/automatic cruise control (ACC), cooperative adaptive cruise control (CACC), forward crash warning (FCW), automatic emergency braking (AEB), lane departure warnings (LDW), lane keep assist (LKA), blind spot warning (BSW), rear cross-traffic warning (RCTW), collision warning systems (CWS), lane centering (LC), and/or other features and functionality.

The ACC systems may use RADAR sensor(s) 760, LIDAR sensor(s) 764, and/or a camera(s). The ACC systems may include longitudinal ACC and/or lateral ACC. Longitudinal ACC monitors and controls the distance to the vehicle immediately ahead of the vehicle 102 and automatically adjusts the vehicle speed to maintain a safe distance from vehicles ahead. Lateral ACC performs distance keeping, and advises the vehicle 102 to change lanes when necessary. Lateral ACC is related to other ADAS applications such as LCA and CWS.
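By way of illustration only, and not as the ACC system itself, the following Python sketch shows a simplified longitudinal controller that tracks a time-gap to a lead vehicle; the gains, gap policy, and limits are assumptions introduced for the example.

    # Minimal sketch (assumed gains and limits): longitudinal ACC that commands an
    # acceleration to hold a time-gap to the lead vehicle.
    def longitudinal_acc(ego_speed_mps, lead_distance_m, lead_speed_mps,
                         time_gap_s=2.0, min_gap_m=5.0, kp_gap=0.3, kp_speed=0.5,
                         accel_limit=2.0, decel_limit=-3.5):
        """Return a commanded acceleration (m/s^2) that tracks the desired following gap."""
        desired_gap = min_gap_m + time_gap_s * ego_speed_mps
        gap_error = lead_distance_m - desired_gap      # positive when there is extra room
        speed_error = lead_speed_mps - ego_speed_mps   # positive when the lead pulls away
        command = kp_gap * gap_error + kp_speed * speed_error
        return max(decel_limit, min(accel_limit, command))

    if __name__ == "__main__":
        # Lead vehicle 30 m ahead and slightly slower: expect a deceleration command.
        print(longitudinal_acc(ego_speed_mps=25.0, lead_distance_m=30.0, lead_speed_mps=23.0))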

CACC uses information from other vehicles that may be received via the network interface 724 and/or the wireless antenna(s) 726 from other vehicles via a wireless link, or indirectly, over a network connection (e.g., over the Internet). Direct links may be provided by a vehicle-to-vehicle (V2V) communication link, while indirect links may be provided by an infrastructure-to-vehicle (I2V) communication link. In general, the V2V communication concept provides information about the immediately preceding vehicles (e.g., vehicles immediately ahead of and in the same lane as the vehicle 102), while the I2V communication concept provides information about traffic further ahead. CACC systems may include either or both I2V and V2V information sources. Given the information of the vehicles ahead of the vehicle 102, CACC may be more reliable, and it has the potential to improve traffic flow smoothness and reduce congestion on the road.

FCW systems are designed to alert the driver to a hazard, so that the driver may take corrective action. FCW systems use a front-facing camera and/or RADAR sensor(s) 760, coupled to a dedicated processor, DSP, FPGA, and/or ASIC, that is electrically coupled to driver feedback, such as a display, speaker, and/or vibrating component. FCW systems may provide a warning, such as in the form of a sound, visual warning, vibration, and/or a quick brake pulse.

AEB systems detect an impending forward collision with another vehicle or other object, and may automatically apply the brakes if the driver does not take corrective action within a specified time or distance parameter. AEB systems may use front-facing camera(s) and/or RADAR sensor(s) 760, coupled to a dedicated processor, DSP, FPGA, and/or ASIC. When the AEB system detects a hazard, it typically first alerts the driver to take corrective action to avoid the collision and, if the driver does not take corrective action, the AEB system may automatically apply the brakes in an effort to prevent, or at least mitigate, the impact of the predicted collision. AEB systems may include techniques such as dynamic brake support and/or crash imminent braking.
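By way of illustration only, the following Python sketch shows an assumed staged AEB response keyed to time-to-collision, first warning the driver and then braking; the thresholds and function names are assumptions, not values from the disclosure.

    # Minimal sketch (assumed thresholds): staged AEB decision based on time-to-collision.
    def aeb_decision(range_m, closing_speed_mps, warn_ttc_s=2.5, brake_ttc_s=1.2):
        """Return 'none', 'warn', or 'brake' from a simple TTC comparison."""
        if closing_speed_mps <= 0.0:
            return "none"                   # not closing on the object
        ttc = range_m / closing_speed_mps
        if ttc < brake_ttc_s:
            return "brake"                  # driver did not act; apply the brakes
        if ttc < warn_ttc_s:
            return "warn"                   # alert the driver to take corrective action
        return "none"

    if __name__ == "__main__":
        for rng in (40.0, 20.0, 8.0):
            print(rng, "m ->", aeb_decision(range_m=rng, closing_speed_mps=10.0))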

LDW systems provide visual, audible, and/or tactile warnings, such as steering wheel or seat vibrations, to alert the driver when the vehicle 102 crosses lane markings. An LDW system does not activate when the driver indicates an intentional lane departure, such as by activating a turn signal. LDW systems may use front-side facing cameras, coupled to a dedicated processor, DSP, FPGA, and/or ASIC, that is electrically coupled to driver feedback, such as a display, speaker, and/or vibrating component.

LKA systems are a variation of LDW systems. LKA systems provide steering input or braking to correct the vehicle 102 if the vehicle 102 starts to exit the lane.

BSW systems detect and warn the driver of vehicles in an automobile's blind spot. BSW systems may provide a visual, audible, and/or tactile alert to indicate that merging or changing lanes is unsafe. The system may provide an additional warning when the driver uses a turn signal. BSW systems may use rear-side facing camera(s) and/or RADAR sensor(s) 760, coupled to a dedicated processor, DSP, FPGA, and/or ASIC, that is electrically coupled to driver feedback, such as a display, speaker, and/or vibrating component.

RCTW systems may provide visual, audible, and/or tactile notification when an object is detected outside the rear-camera range when the vehicle 102 is backing up. Some RCTW systems include AEB to ensure that the vehicle brakes are applied to avoid a crash. RCTW systems may use one or more rear-facing RADAR sensor(s) 760, coupled to a dedicated processor, DSP, FPGA, and/or ASIC, that is electrically coupled to driver feedback, such as a display, speaker, and/or vibrating component.

Conventional ADAS systems may be prone to false positive results, which may be annoying and distracting to a driver, but typically are not catastrophic, because the ADAS systems alert the driver and allow the driver to decide whether a safety condition truly exists and act accordingly. However, in an autonomous vehicle 102, the vehicle 102 itself must, in the case of conflicting results, decide whether to heed the result from a primary computer or a secondary computer (e.g., a first controller 736 or a second controller 736). For example, in some embodiments, the ADAS system 738 may be a backup and/or secondary computer for providing perception information to a backup computer rationality module. The backup computer rationality monitor may run redundant diverse software on hardware components to detect faults in perception and dynamic driving tasks. Outputs from the ADAS system 738 may be provided to a supervisory MCU. If outputs from the primary computer and the secondary computer conflict, the supervisory MCU must determine how to reconcile the conflict to ensure safe operation.

In some examples, the primary computer may be configured to provide the supervisory MCU with a confidence score, indicating the primary computer's confidence in the chosen result. If the confidence score exceeds a threshold, the supervisory MCU may follow the primary computer's direction, regardless of whether the secondary computer provides a conflicting or inconsistent result. Where the confidence score does not meet the threshold, and where the primary and secondary computers indicate different results (e.g., the conflict), the supervisory MCU may arbitrate between the computers to determine the appropriate outcome.
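By way of illustration only, the following Python sketch shows one assumed arbitration policy of the kind described above, in which the supervisory MCU follows the primary computer when its confidence score clears a threshold and otherwise falls back to the more cautious action; the threshold and tie-break rule are assumptions introduced for the example.

    # Minimal sketch (assumed policy): confidence-gated arbitration between a primary
    # and a secondary computer with conflicting outputs.
    from dataclasses import dataclass

    @dataclass
    class ComputerOutput:
        action: str        # e.g., "brake", "continue"
        confidence: float  # 0.0 .. 1.0

    def arbitrate(primary: ComputerOutput, secondary: ComputerOutput,
                  threshold: float = 0.8) -> str:
        if primary.action == secondary.action:
            return primary.action            # no conflict to resolve
        if primary.confidence >= threshold:
            return primary.action            # follow the primary computer's direction
        # Below threshold with conflicting results: fall back to the more cautious action.
        return "brake" if "brake" in (primary.action, secondary.action) else secondary.action

    if __name__ == "__main__":
        print(arbitrate(ComputerOutput("continue", 0.9), ComputerOutput("brake", 0.6)))  # continue
        print(arbitrate(ComputerOutput("continue", 0.6), ComputerOutput("brake", 0.7)))  # brake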

The supervisory MCU may be configured to run a neural network(s) that is trained and configured to determine, based on outputs from the primary computer and the secondary computer, conditions under which the secondary computer provides false alarms. Thus, the neural network(s) in the supervisory MCU may learn when the secondary computer's output may be trusted, and when it cannot. For example, when the secondary computer is a RADAR-based FCW system, a neural network(s) in the supervisory MCU may learn when the FCW system is identifying metallic objects that are not, in fact, hazards, such as a drainage grate or manhole cover that triggers an alarm. Similarly, when the secondary computer is a camera-based LDW system, a neural network in the supervisory MCU may learn to override the LDW when bicyclists or pedestrians are present and a lane departure is, in fact, the safest maneuver. In embodiments that include a neural network(s) running on the supervisory MCU, the supervisory MCU may include at least one of a DLA or GPU suitable for running the neural network(s) with associated memory. In preferred embodiments, the supervisory MCU may comprise and/or be included as a component of the SoC(s) 704.

In other examples, ADAS system 738 may include a secondary computer that performs ADAS functionality using traditional rules of computer vision. As such, the secondary computer may use classic computer vision rules (if-then), and the presence of a neural network(s) in the supervisory MCU may improve reliability, safety, and performance. For example, the diverse implementation and intentional non-identity make the overall system more fault-tolerant, especially to faults caused by software (or software-hardware interface) functionality. For example, if there is a software bug or error in the software running on the primary computer, and the non-identical software code running on the secondary computer provides the same overall result, the supervisory MCU may have greater confidence that the overall result is correct, and the bug in software or hardware on the primary computer is not causing material error.

In some examples, the output of the ADAS system 738 may be fed into the primary computer's perception block and/or the primary computer's dynamic driving task block. For example, if the ADAS system 738 indicates a forward crash warning due to an object immediately ahead, the perception block may use this information when identifying objects. In other examples, the secondary computer may have its own neural network which is trained and thus reduces the risk of false positives, as described herein.

The vehicle 102 may further include the infotainment SoC 730 (e.g., an in-vehicle infotainment system (IVI)). Although illustrated and described as a SoC, the infotainment system may not be a SoC, and may include two or more discrete components. The infotainment SoC 730 may include a combination of hardware and software that may be used to provide audio (e.g., music, a personal digital assistant, navigational instructions, news, radio, etc.), video (e.g., TV, movies, streaming, etc.), phone (e.g., hands-free calling), network connectivity (e.g., LTE, Wi-Fi, etc.), and/or information services (e.g., navigation systems, rear-parking assistance, a radio data system, vehicle related information such as fuel level, total distance covered, brake fluid level, oil level, door open/close, air filter information, etc.) to the vehicle 102. For example, the infotainment SoC 730 may include radios, disk players, navigation systems, video players, USB and Bluetooth connectivity, carputers, in-car entertainment, Wi-Fi, steering wheel audio controls, hands-free voice control, a heads-up display (HUD), an HMI display 734, a telematics device, a control panel (e.g., for controlling and/or interacting with various components, features, and/or systems), and/or other components. The infotainment SoC 730 may further be used to provide information (e.g., visual and/or audible) to a user(s) of the vehicle, such as information from the ADAS system 738, autonomous driving information such as planned vehicle maneuvers, trajectories, surrounding environment information (e.g., intersection information, vehicle information, road information, etc.), and/or other information.

The infotainment SoC 730 may include GPU functionality. The infotainment SoC 730 may communicate over the bus 702 (e.g., CAN bus, Ethernet, etc.) with other devices, systems, and/or components of the vehicle 102. In some examples, the infotainment SoC 730 may be coupled to a supervisory MCU such that the GPU of the infotainment system may perform some self-driving functions in the event that the primary controller(s) 736 (e.g., the primary and/or backup computers of the vehicle 102) fail. In such an example, the infotainment SoC 730 may put the vehicle 102 into a chauffeur to safe stop mode, as described herein.

The vehicle 102 may further include an instrument cluster 732 (e.g., a digital dash, an electronic instrument cluster, a digital instrument panel, etc.). The instrument cluster 732 may include a controller and/or supercomputer (e.g., a discrete controller or supercomputer). The instrument cluster 732 may include a set of instrumentation such as a speedometer, fuel level, oil pressure, tachometer, odometer, turn indicators, gearshift position indicator, seat belt warning light(s), parking-brake warning light(s), engine-malfunction light(s), airbag (SRS) system information, lighting controls, safety system controls, navigation information, etc. In some examples, information may be displayed and/or shared among the infotainment SoC 730 and the instrument cluster 732. In other words, the instrument cluster 732 may be included as part of the infotainment SoC 730, or vice versa.

FIG. 7D is a system diagram for communication between cloud-based server(s) and the example autonomous vehicle 102 of FIG. 7A, in accordance with some embodiments of the present disclosure. The system 776 may include server(s) 778, network(s) 104, and vehicles, including the vehicle 102. The server(s) 778 may include a plurality of GPUs 784(A)-784(H) (collectively referred to herein as GPUs 784), PCIe switches 782(A)-782(H) (collectively referred to herein as PCIe switches 782), and/or CPUs 780(A)-780(B) (collectively referred to herein as CPUs 780). The GPUs 784, the CPUs 780, and the PCIe switches may be interconnected with high-speed interconnects such as, for example and without limitation, NVLink interfaces 788 developed by NVIDIA and/or PCIe connections 786. In some examples, the GPUs 784 are connected via NVLink and/or NVSwitch SoC and the GPUs 784 and the PCIe switches 782 are connected via PCIe interconnects. Although eight GPUs 784, two CPUs 780, and two PCIe switches are illustrated, this is not intended to be limiting. Depending on the embodiment, each of the server(s) 778 may include any number of GPUs 784, CPUs 780, and/or PCIe switches. For example, the server(s) 778 may each include eight, sixteen, thirty-two, and/or more GPUs 784.

The server(s) 778 may receive, over the network(s) 104 and from the vehicles, image data representative of images showing unexpected or changed road conditions, such as recently commenced road-work. The server(s) 778 may transmit, over the network(s) 104 and to the vehicles, neural networks 792, updated neural networks 792, and/or map information 794, including information regarding traffic and road conditions. The updates to the map information 794 may include updates for the HD map 722, such as information regarding construction sites, potholes, detours, flooding, and/or other obstructions. In some examples, the neural networks 792, the updated neural networks 792, and/or the map information 794 may have resulted from new training and/or experiences represented in data received from any number of vehicles in the environment, and/or based on training performed at a datacenter (e.g., using the server(s) 778 and/or other servers).

The server(s) 778 may be used to train machine learning models (e.g., neural networks) based on training data. The training data may be generated by the vehicles, and/or may be generated in a simulation (e.g., using a game engine). In some examples, the training data is tagged (e.g., where the neural network benefits from supervised learning) and/or undergoes other pre-processing, while in other examples the training data is not tagged and/or pre-processed (e.g., where the neural network does not require supervised learning). Once the machine learning models are trained, the machine learning models may be used by the vehicles (e.g., transmitted to the vehicles over the network(s) 104), and/or the machine learning models may be used by the server(s) 778 to remotely monitor the vehicles.

In some examples, the server(s) 778 may receive data from the vehicles and apply the data to up-to-date real-time neural networks for real-time intelligent inferencing. The server(s) 778 may include deep-learning supercomputers and/or dedicated AI computers powered by GPU(s) 784, such as DGX and DGX Station machines developed by NVIDIA. However, in some examples, the server(s) 778 may include deep learning infrastructure that uses only CPU-powered datacenters.

The deep-learning infrastructure of the server(s) 778 may be capable of fast, real-time inferencing, and may use that capability to evaluate and verify the health of the processors, software, and/or associated hardware in the vehicle 102. For example, the deep-learning infrastructure may receive periodic updates from the vehicle 102, such as a sequence of images and/or objects that the vehicle 102 has located in that sequence of images (e.g., via computer vision and/or other machine learning object classification techniques). The deep-learning infrastructure may run its own neural network to identify the objects and compare them with the objects identified by the vehicle 102 and, if the results do not match and the infrastructure concludes that the AI in the vehicle 102 is malfunctioning, the server(s) 778 may transmit a signal to the vehicle 102 instructing a fail-safe computer of the vehicle 102 to assume control, notify the passengers, and complete a safe parking maneuver.
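By way of illustration only, the following Python sketch shows an assumed server-side consistency check of the kind described above, comparing the vehicle's reported detections against the server's own inference results and flagging a possible malfunction; the mismatch metric, threshold, and labels are assumptions introduced for the example.

    # Minimal sketch (assumed metric and threshold): flag the vehicle's perception as
    # suspect when it misses too many of the objects the server detects in the same images.
    def detection_mismatch_ratio(vehicle_objects, server_objects):
        """Fraction of server-detected object labels the vehicle failed to report."""
        if not server_objects:
            return 0.0
        missed = sum(1 for obj in server_objects if obj not in vehicle_objects)
        return missed / len(server_objects)

    def vehicle_ai_suspect(vehicle_objects, server_objects, max_mismatch=0.3):
        return detection_mismatch_ratio(vehicle_objects, server_objects) > max_mismatch

    if __name__ == "__main__":
        vehicle = {"car", "pedestrian"}
        server = {"car", "pedestrian", "cyclist", "truck"}
        if vehicle_ai_suspect(vehicle, server):
            print("signal fail-safe computer to assume control")
        else:
            print("vehicle perception consistent with server inference")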

For inferencing, the server(s) 778 may include the GPU(s) 784 and one or more programmable inference accelerators (e.g., NVIDIA's TensorRT 3). The combination of GPU-powered servers and inference acceleration may make real-time responsiveness possible. In other examples, such as where performance is less critical, servers powered by CPUs, FPGAs, and other processors may be used for inferencing.

Example Computing Device

FIG. 8 is a block diagram of an example computing device 800 suitable for use in implementing some embodiments of the present disclosure. Computing device 800 may include a bus 802 that directly or indirectly couples the following devices: memory 804, one or more central processing units (CPUs) 806, one or more graphics processing units (GPUs) 808, a communication interface 810, input/output (I/O) ports 812, input/output components 814, a power supply 816, and one or more presentation components 818 (e.g., display(s)).

Although the various blocks of FIG. 8 are shown as connected via the bus 802 with lines, this is not intended to be limiting and is for clarity only. For example, in some embodiments, a presentation component 818, such as a display device, may be considered an I/O component 814 (e.g., if the display is a touch screen). As another example, the CPUs 806 and/or GPUs 808 may include memory (e.g., the memory 804 may be representative of a storage device in addition to the memory of the GPUs 808, the CPUs 806, and/or other components). In other words, the computing device of FIG. 8 is merely illustrative. Distinction is not made between such categories as “workstation,” “server,” “laptop,” “desktop,” “tablet,” “client device,” “mobile device,” “handheld device,” “game console,” “electronic control unit (ECU),” “virtual reality system,” and/or other device or system types, as all are contemplated within the scope of the computing device of FIG. 8.

The bus 802 may represent one or more busses, such as an address bus, a data bus, a control bus, or a combination thereof. The bus 802 may include one or more bus types, such as an industry standard architecture (ISA) bus, an extended industry standard architecture (EISA) bus, a video electronics standards association (VESA) bus, a peripheral component interconnect (PCI) bus, a peripheral component interconnect express (PCIe) bus, and/or another type of bus.

The memory 804 may include any of a variety of computer-readable media. The computer-readable media may be any available media that may be accessed by the computing device 800. The computer-readable media may include both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, the computer-readable media may comprise computer-storage media and communication media.

The computer-storage media may include both volatile and nonvolatile media and/or removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, and/or other data types. For example, the memory 804 may store computer-readable instructions (e.g., that represent a program(s) and/or a program element(s), such as an operating system). Computer-storage media may include, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by the computing device 800. As used herein, computer storage media does not comprise signals per se.

The communication media may embody computer-readable instructions, data structures, program modules, and/or other data types in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” may refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, the communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.

The CPU(s) 806 may be configured to execute the computer-readable instructions to control one or more components of the computing device 800 to perform one or more of the methods and/or processes described herein. The CPU(s) 806 may each include one or more cores (e.g., one, two, four, eight, twenty-eight, seventy-two, etc.) that are capable of handling a multitude of software threads simultaneously. The CPU(s) 806 may include any type of processor, and may include different types of processors depending on the type of computing device 800 implemented (e.g., processors with fewer cores for mobile devices and processors with more cores for servers). For example, depending on the type of computing device 800, the processor may be an ARM processor implemented using Reduced Instruction Set Computing (RISC) or an x86 processor implemented using Complex Instruction Set Computing (CISC). The computing device 800 may include one or more CPUs 806 in addition to one or more microprocessors or supplementary co-processors, such as math co-processors.

The GPU(s) 808 may be used by the computing device 800 to render graphics (e.g., 3D graphics). The GPU(s) 808 may include hundreds or thousands of cores that are capable of handling hundreds or thousands of software threads simultaneously. The GPU(s) 808 may generate pixel data for output images in response to rendering commands (e.g., rendering commands from the CPU(s) 806 received via a host interface). The GPU(s) 808 may include graphics memory, such as display memory, for storing pixel data. The display memory may be included as part of the memory 804. The GPU(s) 808 may include two or more GPUs operating in parallel (e.g., via a link). When combined together, each GPU 808 may generate pixel data for different portions of an output image or for different output images (e.g., a first GPU for a first image and a second GPU for a second image). Each GPU may include its own memory, or may share memory with other GPUs.

In examples where the computing device 800 does not include the GPU(s) 808, the CPU(s) 806 may be used to render graphics.

The communication interface 810 may include one or more receivers, transmitters, and/or transceivers that enable the computing device 800 to communicate with other computing devices via an electronic communication network, including wired and/or wireless communications. The communication interface 810 may include components and functionality to enable communication over any of a number of different networks, such as wireless networks (e.g., Wi-Fi, Z-Wave, Bluetooth, Bluetooth LE, ZigBee, etc.), wired networks (e.g., communicating over Ethernet), low-power wide-area networks (e.g., LoRaWAN, SigFox, etc.), and/or the Internet.

The I/O ports 812 may enable the computing device 800 to be logically coupled to other devices including the I/O components 814, the presentation component(s) 818, and/or other components, some of which may be built in to (e.g., integrated in) the computing device 800. Illustrative I/O components 814 include a microphone, mouse, keyboard, joystick, game pad, game controller, satellite dish, scanner, printer, wireless device, etc. The I/O components 814 may provide a natural user interface (NUI) that processes air gestures, voice, or other physiological inputs generated by a user. In some instances, inputs may be transmitted to an appropriate network element for further processing. An NUI may implement any combination of speech recognition, stylus recognition, facial recognition, biometric recognition, gesture recognition both on screen and adjacent to the screen, air gestures, head and eye tracking, and touch recognition (as described in more detail below) associated with a display of the computing device 800. The computing device 800 may include depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, touchscreen technology, and combinations of these, for gesture detection and recognition. Additionally, the computing device 800 may include accelerometers or gyroscopes (e.g., as part of an inertial measurement unit (IMU)) that enable detection of motion. In some examples, the output of the accelerometers or gyroscopes may be used by the computing device 800 to render immersive augmented reality or virtual reality.

The power supply 816 may include a hard-wired power supply, a battery power supply, or a combination thereof. The power supply 816 may provide power to the computing device 800 to enable the components of the computing device 800 to operate.

The presentation component(s) 818 may include a display (e.g., a monitor, a touch screen, a television screen, a heads-up-display (HUD), other display types, or a combination thereof), speakers, and/or other presentation components. The presentation component(s) 818 may receive data from other components (e.g., the GPU(s) 808, the CPU(s) 806, etc.), and output the data (e.g., as an image, video, sound, etc.).

The disclosure may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program modules, being executed by a computer or other machine, such as a personal data assistant or other handheld device. Generally, program modules including routines, programs, objects, components, data structures, etc., refer to code that performs particular tasks or implements particular abstract data types. The disclosure may be practiced in a variety of system configurations, including handheld devices, consumer electronics, general-purpose computers, more specialty computing devices, etc. The disclosure may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network.

As used herein, a recitation of “and/or” with respect to two or more elements should be interpreted to mean only one element, or a combination of elements. For example, “element A, element B, and/or element C” may include only element A, only element B, only element C, element A and element B, element A and element C, element B and element C, or elements A, B, and C. In addition, “at least one of element A or element B” may include at least one of element A, at least one of element B, or at least one of element A and at least one of element B. Further, “at least one of element A and element B” may include at least one of element A, at least one of element B, or at least one of element A and at least one of element B.

The subject matter of the present disclosure is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this disclosure. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Moreover, although the terms “step” and/or “block” may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.

What is claimed is:
1. A method comprising: obtaining, using one or more sensors of a machine, state data representative of one or more states of one or more components of the machine; sending, using the machine and to a remote system, the state data; receiving, using the machine and from the remote system, control data representative of one or more controls for the machine; and causing, based at least on the state data and the control data, the machine to perform one or more operations.
2. The method of claim 1, wherein the sending of the state data causes the remote system to calibrate one or more control devices according to the one or more states of the one or more components.
3. The method of claim 1, further comprising: obtaining, using one or more image sensors of the machine, image data corresponding to at least a portion of an environment of the machine; and sending the image data to the remote system to cause the remote system to present, based at least on the image data, a display of a representation of at least the portion of the environment.
4. The method of claim 1, wherein: the state data represents at least one or more first values associated with the one or more states of the one or more components; and the control data represents one or more second values associated with the one or more components, the one or more second values being based at least on the one or more first values.
5. The method of claim 1, further comprising: determining, based at least on the state data and the control data, one or more second controls for navigating the machine, wherein the causing the machine to perform the one or more operations is based at least on the one or more second controls.
6. The method of claim 1, further comprising: generating calibration data associated with at least one of the one or more components of the machine or one or more second components of the machine; and sending the calibration data to the remote system.
7. The method of claim 1, wherein the one or more components include at least one of a steering wheel, a wheel, a gear control, or a tire.
8. The method of claim 1, wherein the state data represents at least one of a wheel angle associated with the machine, a steering wheel angle associated with the machine, a current gear associated with the machine, or a tire pressure associated with the machine.
9. A system comprising: one or more processing units to: determine, based at least on state data corresponding to one or more states of one or more components of a machine, one or more calibration parameters associated with one or more control devices, the one or more control devices to remotely control, at least in part, the machine; generate, using the one or more control devices, control data representative of one or more controls for the machine; and send the control data to the machine to cause the machine to perform one or more operations based at least on the control data.
10. The system of claim 9, wherein the one or more processing units are further to calibrate, based at least on the one or more calibration parameters, the one or more control devices to be associated with the one or more states of the one or more components.
11. The system of claim 10, wherein the calibration of the one or more control devices comprises at least one of turning a steering device to an angle that is based at least on an angle of a steering wheel of the machine, adjusting a braking device based at least on a braking associated with the machine, or adjusting an acceleration device based at least on an acceleration associated with the machine.
12. The system of claim 10, wherein the generation of the control data representative of the one or more inputs comprises generating one or more of: first control data representative of a steering control that is based at least on a first input to a steering device as calibrated using the one or more calibration parameters; second control data representative of a braking control that is based at least on a second input to a braking device as calibrated using the one or more calibration parameters; or third control data representative of an acceleration control that is based at least on a third input to a braking device as calibrated using the one or more calibration parameters.
13. The system of claim 9, wherein the one or more processing units are further to: receive, from the machine, image data corresponding to a real-world environment of the machine; generate, based at least on the image data, a virtual environment representation associated with the real-world environment; and cause a display of the virtual environment representation.
14. The system of claim 9, wherein the one or more processing units are further to: receive, from the machine, calibration data associated with the one or more components of the machine; and calibrate the one or more control devices based at least on the calibration data.
15. The system of claim 14, wherein the one or more control devices include one or more of a steering device, a braking device, or an acceleration device, and wherein the calibration of the one or more control devices comprises calibrating one or more of a sensitivity associated with the steering device, a sensitivity associated with the braking device, or a sensitivity associated with the acceleration device.
16. The system of claim 9, wherein the system is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system implemented using a machine; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
17. A processor comprising: one or more processing units to cause a machine to perform one or more operations based at least on control data received from a remote system, wherein the control data is generated based at least on calibration data sent from the machine to the remote system, the calibration data representative of one or more values for one or more calibration parameters associated with one or more components of the machine.
18. The processor of claim 17, wherein the one or more processing units are further to: obtain, using one or more image sensors associated with the machine, image data corresponding to a real-world environment; and send the image data to the remote system, wherein the image data causes the remote system to generate a virtual environment representation associated with the real-world environment.
19. The processor of claim 17, wherein the one or more processing units are further to: send, to the remote system, state data representative of one or more states of the one or more components of the machine, wherein the control data is further based at least on the state data.
20. The processor of claim 17, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system implemented using a machine; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.